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ABSTRACT 


Most  database  system  designs  and  implementations  are  limited  to  single  language 
(monolingual)  and  single  model  (monomodel)  database  systems.  No  one  database  language 
and  model  can  meet  every  data  processing  need.  As  a  result,  diverse  (heterogeneous)  data¬ 
base  management  systems  are  used  in  a  large  organization.  This  approach  results  in  more 
processing  cost  and  less  data  sharing.  Therefore,  there  is  a  growing  need  for  a  solution  to 
the  processing  cost  and  data  sharing  problems  of  heterogeneous  database  systems.  One  so¬ 
lution  is  a  multimodel  and  multilingual  database  system  (MM  &  MLDS)  as  implemented 
on  the  Multibackend  Database  Supercomputer  (MDBS).  This  solution  can  support  an  ap¬ 
propriate  language  and  model  corresponding  to  each  set  of  application  requirements.  In  ad¬ 
dition,  this  solution  lends  itself  to  data  sharing  by  not  only  storing  all  the  data  in  one  kernel 
data  model  form,  but  also  by  allowing  cross-model  data  accessing. 

The  goal  of  this  thesis  is  to  make  the  MDBS  viable  as  a  kernel  database  management 
system.  First,  we  introduce  the  multimodel,  multilingual,  multibackend  database  super¬ 
computer  which  has  been  found  as  a  solution  towards  heterogeneous  database  sharing.  Sec¬ 
ond,  we  increase  the  utilization  of  MDBS  by  introducing  the  system  generation  software 
management  tool.  Finally,  we  ease  the  learning  curve  associated  with  MDBS  by  document¬ 
ing  the  entire  system  structure  including  all  the  files  needed  to  compile  and  run  the  system. 
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L  THE  INTRODUCTION 


A.  MY  MOTIVATIONS  FOR  THE  THESIS  TOPIC 

Over  the  past  few  decades  there  has  been  a  rapid  increase  in  storage  and  processing 
capacity  of  computers  (Elmasri,  1989,  p.  637).  While  the  capacity  increases,  the  cost  for  the 
same  amount  of  capacity  decreases.  One  wuuld  expect  the  overall  performance  for 
computer  programs  and  systems  to  improve.  This  has  been  the  case  for  the  general-puipose 
computer;  however,  one  application  for  the  general-purpose  computer  has  not  kept  up  with 
the  growth.  This  application  is  called  database  management.  In  the  following  references  we 
note  that  database  systems  have  unique  hardware  needs  that  are  not  normally  built  into 
general-purpose  computers. 

...the  overall  performance  of  database  systems  is  still  a  great  concern.  [There  is 
a]. ..mismatch  between  access  speeds  of  secondary  storage  devices  and  those  of  central 
processing  units  leading  to  inevitable  delays  in  processing  large  volumes  of  data 
stored  on  disk.  Compared  to  the  advances  in  speeds  of  processors...the  rate  of  increase 
of  speeds  of  secondary  storage  devices  (mainly  disks)  have  been  slow.  (Elmasri,  1989, 
p.  637-8) 

Typical  database  management  operations  require  the  processing  of  90-95  percent 
of  related  data  for  the  purpose  of  producing  5-10  percent  of  useful  information.... 
(Banerjee,  1979,  p.  137) 

...data  is  not  stored  at  the  place  where  the  data  is  processed.  To  ‘stage’  the  data 
in  the  main  memory  for  processing  is  very  time  consuming.  It  often  ties  up  important 
resources  of  a  computing  system  such  as  communication  lines,  channels,  and  data 
buses.  Ideally,  data  should  be  processed  at  the  place  where  they  are  stored  to  avoid 
spending  time  in  moving  data  between  the  main  memory  and  the  secondary  memories. 
(Su,  1979,  p.  430) 

Methods  to  improve  the  performance  of  database  systems  can  be  considered  in  two 
main  categories:  software  and  hardware.  The  software-only  approach  uses  indices,  file- 


Over  the  past  20  years,  data  storage  capacity  for  a  moving  head  disk  has  doubled  every  three 
years  and  Dynamic  Random  Access  Memory  storage  density  has  quadrupled  every  3  years  (Hen- 
nessy,  1990,  p.  17). 

Growth  of  Performance  over  the  last  20  years  ranged  from  18%  to  35%  depending  upon  com¬ 
puter  class  (Hennessy,  1990,  p.  3). 
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organization  techniques,  intra-data  pointers,  data  compaction,  directories,  cross-reference 
pointers,  computed  addressing,  to  name  a  few,  which  alleviate  the  speed  problem  to  a 
certain  extent  (Su,  1979,  p.  430).  Unfortunately  the  software-only  approach  brings  with  it 
additional  cost.  The  cost  and  overhead  are  the  requirement  for  more  storage  and  increased 
processing  time.  The  overhead  includes,  for  example,  maintenance  of  indices,  computation 
of  pointers,  and  performance  of  compaction  algorithms. 

A  second  approach  is  the  hardware  approach.  Several  database  machines  have  been 
proposed  over  the  last  two  decades.  The  following  is  a  list  of  some  of  the  proposed 
hardware  concepts:  associative  memory  (DeWitt,  1979),  backend  computers,  cellular  logic 
(Schuster,  1979),  cross-bar  network  (DeWitt,  1979),  function-specialized  (Baneijee, 
1979),  logic-per-track  (first  introduced  by  D.  L.  Slotnick),  (Su,  1979),  (DeWitt,  1979), 
(Schuster,  1979),  multiprocessor  systems  (DeWitt,  1979),  and  special  purpose  processors 
(Su,  1979),  (Banerjee,  1979).  Few  have  been  successfully  built  and  even  fewer  have 
become  commercial  products. 

The  ideal  database  computer  should  use  the  optimal  software  techniques  combined 
with  the  optimal  hardware  designs.  The  ideal  database  computer  has  not  been  budt.  One 
database  machine,  known  as  the  Multibackend  Database  Supercomputer  (MDBS),  is  at  the 
Naval  Postgraduate  School’s  Laboratory  for  Database  Systems  Research.  This  system 
which  started  development  in  the  early  1980’s,  is  fundamentally  based  upon  the  Database 
Computer  (DBC)  described  in  (Baneijee,  1979).  This  system  not  only  addresses  a  solution 
to  the  problem  of  processing  large  amounts  of  data  efficiently,  but  is  also  a  multilingual  and 
multimodel  database  system. 

What  is  the  significance  of  a  multilingual  and  multimodel  database  system?  Consider 
large  organizations  which  often  use  multiple  database  models  and  languages  to  meet  their 
database  needs.  Three  typical  databases  would  be  for  inventory,  personnel,  and  product 
assembly.  An  inventory  (consisting  of  suppliers,  supplies,  and  orders)  can  best  be  modeled 
with  a  network  database  system.  A  personnel  database  can  best  be  maintained  with  a 
relational  database  system.  And  a  product  assembly  (consisting  of  products,  assemblies. 
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sub-assemblies,  parts,  and  so  on)  can  best  be  supported  by  a  hierarchial  database  system. 
In  addition  to  the  different  models  and  languages,  there  are  often  more  than  one  database 
with  the  same  information.  One  example  is  a  personal  database  for  a  social  roster  on  one 
machine  and  a  payroll  database  on  another  machine.  Both  databases  have  a  person’s  name 
and  address  among  other  information. 

Some  results  of  having  different  models  and  languages  are  as  follows.  First  there  is  a 

lack  of  data  sharing.  Since  the  heterogeneous  databases^  cannot  communicate  between 
themselves,  all  information  must  be  maintained  by  each  system  even  if  the  same  data  exists 
in  another  database  system.  The  lack  of  data  sharing  increases  the  storage  capacity  needed 
to  maintain  the  separate  databases.  Second,  the  duplicate  information  stored  leads  to  data 
inconsistency  problems.  If  one  updates  a  database,  then  another  database  with  the  same 
information  will  be  incorrect  until  it  is  also  updated.  Third,  personnel  who  know  how  to  use 
one  database  model  and  language  have  to  be  retrained  to  use  another  model  effectively. 
This  specialization  makes  it  more  costly  (and  harder)  to  move  personnel  between  jobs, 
since  it  requires  knowledge  of  a  database  model  with  which  the  personnel  are  not  familiar. 
This  problem  affects  the  total  cost  and  effectiveness  of  maintenance  and  support  of 
database  systems. (Hsiao,  1992,  pp.  127-132),  (Chung,  1990,  p.  70) 

The  fact  that  different  models  exist  does  not  mean  that  we  should  look  for  a  single  or 
universal  data  model  and  language  so  that  we  can  convert  all  present  and  future  databases 
into  one  model  and  language  just  to  cut  the  cost  of  maintenance  and  support.  At  the  same 
tune  we  should  not  ignore  the  problem.  One  solution  to  the  heterogeneous  data  sharing 
problem  was  proposed  by  (Hsiao,  1989).  This  solution  has  been  partially  implemented  on 
the  Multibackend  Database  Supercomputer  (MDBS)  at  the  Naval  Postgraduate  School 
(NPS). 

Recognizing  that  no  one  database  model  or  language  can  meet  all  database  needs,  this 
solution  supports  all  database  languages.  All  data  is  stored  using  a  kernel  data  language  (the 

Heterogeneous  in  the  sense  that  there  are  multiple  data  models.This  is  not  to  be  confused  with 
different  dialects  of  the  same  model  and  language.  Nor  does  refer  to  different  hardware. 
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Attribute  Based  Data  Language)  and  requires  that  each  database  language  interface  store 
its  data  using  the  kernel  data  language.  This  design  was  chosen  since  it  reduces  the  cross¬ 
model  data  translations  from  n^  -  n  to  n  - 1.  Where  n  is  the  number  of  database  languages 
implemented  by  the  system.  (Hsiao,  1992) 

B.  THE  SCOPE  AND  OBJECTIVES  OF  THE  THESIS 

My  objective  of  the  thesis  is  three  fold.  First,  to  introduce  the  multimodel, 
multilingual,  multibackend  database  supercomputer  which  has  been  found  as  a  solution 
towards  heterogeneous  database  sharing.  Second,  to  introduce  the  system  generation 
software  management  tool.  And  finally  to  document  the  structure  of  the  MDBS  files  needed 
to  compile  and  run  the  system. 

There  should  be  an  automatic  way  to  generate,  manage,  and  control  various  versions 
of  a  database  system  with  little  or  no  human  intervention  and  support.  This  would  be  good 
for  the  systems  administrator  or  developer,  but  is  of  little  concern  to  the  user.  Since  there 
are  few  parallel  and  distributed  database  systems  in  existence,  there  is  a  lack  of 
methodology  in  developing  these  systems.  On  the  other  hand,  most  contemporary  database 
computers  are  not  parallel;  thus  they  do  not  have  multiple  backends.  There  is  little  need  to 
distribute  multiple  copies  of  the  same  code  to  multiple  backends  on  contemporary  database 
computers,  since  they  do  not  have  backends.  Methods  for  version  control  and 
configuration  management  of  system  software  for  the  scalable  and  parallel  database 
computer  must  be  developed  to  simplify  the  design  process  and  final  usability  of  the 
system.  Thus  methods  of  design  and  implementation  of  a  system  generation  management 
tool  for  a  specific  parallel  and  multibackend  system  must  be  developed.  In  this  thesis,  I  will 
demonstrate  the  feasibility  of  constructing  such  a  tool  on  our  research  database  computer, 
MDBS. 

C.  ORGANIZATION  OF  THE  THESIS 

In  Chapter  II  of  this  thesis,  I  provide  an  overview  of  the  hardware  and  software  of  the 
Multilingual,  Multimodel,  Multibackend  Database  Supercomputer  (MDBS).  In  Chapter 
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ni,  I  describe  what  system  generation  software  is  and  what  functions  it  should  provide.  In 
Chapter  IV,  I  describe  how  my  system  generation  software  is  implemented  and  how  the 
code  is  organized.  Areas  for  future  study  are  addressed  in  Chapter  V.  The  appendices 
contain  much  information  about  the  MDBS  software  which  has  not  been  put  into  one  place 
before. 
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n.  A  DESCRIPTION  OF  THE  MDBS 

A.  THE  ARCHITECTURE 

There  are  two  aspects  of  the  MDBS  which  make  it  a  powerful  database  computer.  The 
first  aspect  is  the  multi-lingual/multi-model  capability  and  the  second  aspect  is  scalability 
with  parallel  backends. 

As  described  in  the  previous  chapter,  MDBS  uses  the  Attribute  Based  Data  Language 
(ABDL)  as  the  kernel  or  common  data  language  to  store  all  databases.  Figure  1  shows  the 
relationship  between  various  database  users  and  how  a  database  is  viewed  and  stored.  There 

are  five^  language  interfaces  on  the  system,  attribute  based,  network,  hierarchial,  relational, 
and  object-oriented.  Each  the  language  interface  uses  a  schema  to  view  a  particular 

database  (which  it  can  access).  Limited  cross-model^  capability  exists  on  MDBS,  relational 
to  hierarchial  and  relational  to  object-oriented. 

The  parallel  processing  capabilities  of  MDBS  were  based  upon  the  Database 
Computer  (DBG)  described  in  (Baneijee,  1979)  and  roughly  depicted  in  Figure  2.  The 
Database  Computer  had  many  optimizations  built-in  to  make  it  as  quick  as  possible  in 
processing  database  queries.  Some  optimizations  which  DBC  used  were  (1)  modified 
moving  head  disk  controllers  which  could  data  from  all  each  tracks  in  one  cylinder 
simultaneously,  (2)  partitioning  to  evenly  distribute  the  base  data  to  ensure  maximum 
number  of  heads  (or  tracks)  in  each  cylinder  are  being  used,  (3)  arranging  then  processing 
queries  in  conjunctive  normal  form,  and  (4)  clustering  of  data.  The  DBC  architecture  broke 
query  processing  into  several  stages  and  placed  single  processors  at  four  stages  and 
multiple  processors  at  the  base  data  and  where  the  indices  are  stored  and  processed.  The 
four  units  with  single  processors  are  the  database  command  8c  control  processor  (DBCCP), 
keyword  transformation  unit  (KXU),  index  transformation  unit  (DCU),  and  security  filter 

A  sixth  one,  functional,  is  in  the  process  of  being  added. 

^  A  database  user  can  access  (retrieve,  update,  delete,  or  insert)  one  database  using  a  different  lan¬ 
guage  than  was  used  to  create  and/or  update  the  database.  For  example,  a  hierarchial  database  can 
be  accessed  by  the  relational  language  interface  through  a  one  time  schema  transformation. 
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Figure  1.  The  Multimodel,  Multilingual  and  Cross-Model  Accessing  Capability 


processor  (SFP).  Multiple  processors  were  placed  the  main  memory  (MM),  the  structure 
memory  (SM)  and  the  structure  memory  information  processor  (SMIP).  The  MM  is  a 
moving  head  disk  drive  with  a  controller  modified  to  simultaneously  read  and  process  data 
from  all  tracks  in  one  cylinder.  The  SM  can  be  either  bubble  memory  or  charged  coupled 
devices  (CCD’ s).  There  is  one  processor  per  track  in  the  SM  and  one  processor  per  memory 
unit  in  the  SMIP.  These  processors  worked  in  parallel  on  various  stages  of  query 
processing.  The  structure  loop  was  designed  to  limit  main  memory  search  space,  determine 
authorized  records  for  accesses,  and  for  clustering  records  received  for  insertion  into  the 
database.  Two  advantages  of  interest  of  the  DBC  are  (1)  an  estimated  speed  up  of  over  80 
times  that  of  a  database  system  implemented  on  a  general  purpose  computer  (single  CPU, 
single  disk  machine)  and  (2)  the  method  of  storing  data  in  partitions  was  intended  to 

segregate  data  as  it  was  stored  for  a  multi-level  security  database^. 

For  various  reasons,  the  DBC  was  never  built.  In  1980  Dr.  Hsiao  decided  to  use  “off 
the  shelf’  equipment  to  implement  a  simplified  version  of  DBC.  This  version  only  used  one 
processor  (per  controller  and  per  backend)  to  perform  the  tasks  that  had  been  designed  to 
be  performed  by  the  multiple  processors  of  the  DBC  design.  Since  the  disk  controllers  were 
not  modified,  the  data  could  not  be  processed  as  it  was  read  from  the  disk.  Without  the 
multiple  processors  on  each  backend  there  was  a  significant  loss  in  the  speed-up  that  would 
have  occurred  with  the  DBC  design.  This  system  was  modified  and  expanded  and  then 
moved  to  the  Naval  Postgraduate  School  in  1983.  It  is  now  known  as  the  Multibackend 
Database  Supercomputer  (MDBS). 

One  strength  of  MDBS  is  the  method  used  to  store  database  data  on  the  backends.  As 
a  database  is  loaded,  it  is  divided  into  clusters  of  similar  data  descriptors'^.  Using  clusters. 

The  muM-level  security  topic  is  not  related  to  this  thesis,  but  is  very  relevant  to  a  Dq)artment  of 
Defense  (DOD)  plication.  The  access  authorization  for  a  user  or  process  would  determine  which 
partitions  of  data  were  within  the  authorized  clearance  level  for  that  user  or  process.  A  good  design 
for  storing  classified  data  on  the  DBC  would  not  allow  reading  any  unauthorized  data  from  the  data 
disk. 

A  descriptor  is  an  attribute  value  or  range  of  values  and  is  used  to  group  the  data. 
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Figure  2.  Architecture  of  the  DBC 
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the  base  data^  is  spread  evenly  across  the  backends.  This  even  division  of  data  across  the 
backends  maximizes  parallel  query  processing  on  the  backends. 

A  second  strength  of  MDBS  lies  in  its  ability  to  easily  change  the  database  storage 
capacity  and  the  amount  of  parallelism.  This  can  be  done  by  changing  the  number  of 
backends  in  the  configuration.  As  a  database  grows  over  time,  reaching  the  capacity  of  an 
existing  system,  the  traditional  solution  is  to  get  a  bigger  and  faster  computer.  Once  the 
replacement  system  is  found  then  (1)  a  database  system  must  be  installed,  (2)  the  data  must 
be  reloaded  on  the  newer  system,  and  (3)  the  existing  queries  must  be  rewritten  and  tested 
for  the  new  system.  Solving  the  database  growth  problem  on  the  MDBS  is  much  simpler. 
To  enlarge  the  MDBS,  one  buys  more  backend  machines,  backs  up  the  database,  adds  the 
new  backends  to  the  configuration,  and  reloads  the  database  on  the  new  configuration.  By 
reloading  the  database,  its  data  is  now  redistributed  over  aU  backends  for  the  new 
configuration  and  is  ready  for  use. 

With  the  database  distributed  across  the  backends,  each  time  the  controller  sends  a 
query  to  the  backends,  each  backend  does  some  work  in  parallel.  Thus  the  parallel 
processing  capabilities  are  achieved  on  MDBS  by  the  multiple  backend  capability  and  to  a 

lesser  extent  by  having  separate  disks  for  base  data,  meta-data^,  and  paging. 

B.  ITS  HARDWARE  CONFIGURATION 

A  fairly  detailed  description  of  the  multibackend  database  supercomputer  (MDBS) 
can  be  found  in  (Hsiao,  1991).  Figure  3  shows  an  overview  of  the  configuration.  At  the 

time  of  this  writing,  the  MDBS  is  being  ported  to  a  new  controller^  and  backend^. 

Base  data  is  stcpred  as  attribute-value  pairs  using  the  Attribute  Based  Data  Model  (ABDM). 

Metadata  is  information  about  the  base  data.  To  access  the  base  rfara  to  find  the  ^piopriate  clus- 
teifs).  The  backend  can  then  directly  access  the  base  data  using  the  cluster  identifier(s).  For  more 
details  see  (Hsiao  1991). 

The  new  controller  is  a  Sun  model  4/110  workstation  called  dbl  1  which  has  the  Sparc  4  RISC 
architecture  with  8  Megabytes  of  random  access  memory  (RAM)  and  one  373  Megabyte  hard 
drive. 

*■  The  two  backends,  dbl2  and  dbl3,  are  Sun  model  4/280  woikstations  which  each  have  Sparc  4 
RISC  architecture  widi  16  Megabytes  of  RAM,  one  96  Megabyte  disk  for  paging,  one  96  MB  disk 
for  meta-data,  and  one  1000  MB  hard  disk  for  data. 
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Meta  data  disk 


Paging  disk 

Figure  3.  The  Multibackend  Database  Supercomputer  (MDBS) 


1.  Network 

The  local  area  network  (LAN)  consists  of  the  Ethernet  cable,  the  backplanes  and 
their  transceivers.  There  is  one  LAN  for  the  MDBS.  The  controller,  dbS,  acts  as  the 
gateway  to  the  local  computer  science  LAN.  Each  backplane  has  an  Intel  80186 
microprocessor  and  128-Kilobytes  of  real  memory.  The  80186  processors  work  in 
conjunction  with  another  processor,  an  Intel  82586,  which  supports  the  transmission 
control  protocol/intemet  protocol  (TCP/IP).  The  Ethernet  cable  has  a  transmission  rate  of 


10  Mega  bps.  The  Ethernet  allows  messages  to  collide,  but  the  user  defined  protocol  (UDP) 
software  helps  make  unreliable  broadcasting  reliable. 

2.  Controller 

The  controller  supports  the  user  interface  as  well  as  communications  with  each 
backend.  The  controller  is  an  Integrated  Solutions  (ISIV)  model  V24S  workstation  with  a 
16  MHz  68020  microprocessor,  4MB  ram,  a  96  Megabyte  paging  disk  to  support  virtual 
memory,  96  Megabyte  disk  for  system  and  user  files,  and  a  tape  drive.  The  controller  is 
connected  to  the  backends  via  one  backplane  and  to  the  local  computer  science  department 
network  via  a  second  backplane.  When  the  database  system  is  executing,  the  controller 
communicates  with  the  backends  via  sockets.  The  magnetic  tape  drive  on  the  controller  can 
be  used  for  mass  loading  of  a  database,  back-ups  of  a  database,  and  normal  back-ups  of  all 
existing  files. 

3.  Backends 

Although  MDBS  is  capable  of  supporting  many  backends,  there  are  three 

backends^  in  the  network.  Each  backend  is  an  Integrated  Solutions  (ISFV)  model  V24S 
workstation  with  a  16  MHz  Motorola  68020  microprocessor,  and  three  disk  drives.  One  96 
Megabyte  Winchester-type  disk  drive  is  used  for  paging,  another  96  Megabyte  disk  is  used 
to  store  the  meta-data.  The  base  data  is  stored  on  a  500  Megabyte  moving-head  disk. 
Communications  with  the  controller  are  handled  by  the  backplane  which  communicates 
with  the  controller  and  other  backends  via  the  Ethernet  Each  backend  maintains  the  meta¬ 
data  table  for  the  entire  database.  The  backends  for  the  database  computer  are  not  required 
to  be  identical  machines. 


The  system  had  as  many  as  eight  backends.  With  two  backends  on  loan  and  three  suffering  from 
old  age,  there  are  only  three  which  can  be  used. 


12 


C.  ITS  SOFTWARE  CONFIGURATION 


The  MDBS  uses  UNDC^®  Berkeley  Software  Distribution  (BSD)  4.3  operating 
system.  All  of  the  software  is  stored  and  compiled  on  the  controller.  When  the  backend 
code  is  compiled  it  must  be  distributed  from  the  controller  to  each  backend.  To  make  the 
compilation  and  distribution  process  simpler,  identical  CPU  and  operating  systems  are  used 
for  contoller-backend  combinations  under  development.  On-line  backups  of  code  can  be 
stored  on  the  backends.  To  support  MDBS,  there  are  six  processes  which  run  on  the 

controller^  ^  and  six  other  processes  which  run  on  each  backend  in  parallel. 

1.  Controller  Processes 

The  six  processes  on  the  controller  are  the  user  interface  (TI^^),  controller  get 
(CGET),  controller  put  (CPUT),  request  preparation  (REQP),  post  processor  (PP),  and 
insert  information  generator  (IIG).  A  simplified  diagram  of  how  the  six  processes 
interrelate  is  in  Figure  4. 

The  process-by  the  far  the  largest--is  TI  which  implements  the  data  language 
interfaces  described  at  the  beginning  of  this  chapter  and  the  cross  model  capabilities  of  the 
MDBS.  Thi§  process  maps  each  data  model  and  language  to  the  kernel  data  language 
(KDL)  and  kernel  data  model  (KDM).  On  MDBS,  kernel  data  is  stored  using  the  attribute 

based  data  model  (ABDM)^^  and  attribute  data  language  (ABDL).  For  each  data  language, 
there  are  four  software  modules  which  make  up  the  user  interface  (illustrated  in  the  shaded 


UNIX  is  a  registered  trademaik  of  AT&T  Bell  Laboratories,  Inc. 

Appendix  A  contains  a  directory  tree  of  all  the  files  in  the  MDBS.  Specifically,  the  controUa- 
jnocesses  (cget,  cput,  iig,  pp,  reqp,  ti)  are  located  in  the  direcuxy  Version_Name/CNTRL/  and  are 
compiled  from  source  code  in  the  corresponding  subdirectories  (CCOM,  CX^OM,  DC,  PP,  RECP, 
TI)ofCNTRL/. 

Appendix  A  contains  a  directory  tree  of  all  the  files  in  the  MDBS.  Specifically,  the  backend  pro¬ 
cesses  (bget,  bput,  cc,  dio,  dirman,  recp)  are  located  in  the  directory  Version_Name/BE/  and  are 
compiled  from  source  code  in  the  corresponding  subdirectories  (BCOM,  BCOM,  CC,  DIO,  DM, 
RECP)  of  BE/. 

TI  stands  for  test  interface,  which  has  become  the  normal  user  interface. 

The  ABDM  was  chosen  for  various  reasons  including  separate  modeling  of  base  data  and  meta 
data,  the  ease  of  clustoing  base  data  into  mutually  exclusive  sets  for  storage  on  backends.  Its 
semantic  richness  allows  for  otha  languages  to  be  translated  into  the  ABDL. 
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Figure  4;  Interrelation  of  System  Processes 
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area  of  Figure  5).  These  are  Ae  language  interface  layer  (LIL),  the  kernel  mapping  system 


UDM  :  User  Data  Model 
UDL  :  User  Data  Language 
LIL  ;  Language  Interface  Layer 
KMS  :  Kernel  Mapping  System 
KFS  :  Kernel  Fonnatting  System 
KC  Kernel  Controller 
M/Ll  :  Model/Language  Interface 
KDS  :  Kernel  Database  System 
KDM  ;  Kernel  Data  Model 
KDL  :  Kernel  Data  Language 


Figure  5.  The  Multimodel  and  Multilingual  Database  System 


(KMS),  the  kernel  formatting  system  (KFS),  and  the  kernel  controller  (KC).  When  starting 
MDBS,  the  user  chooses  which  data  model  and  data  language  interface  with  which  he 
wants  to  view  a  database.  The  model  he  chooses  is  called  the  user  data  model  (UDM)  and 
the  language  corresponding  to  that  models  the  user  data  language  (UDL).  Once  a  language 
has  been  chosen,  the  user  communicates  with  the  LIL  corresponding  to  that  language  (see 
Figure  6).  For  a  detailed  description  of  how  die  software  modules  interacL  see  (Bourgeois, 
1993,  pp.  7-24). 

Communications  with  the  backends  are  handled  by  two  processes,  CGET  and 
CPUT.  The  CPUT  process  sends  database  creation  and  queries  to  the  backends.  The  CGET 
process  receives  acknowledgments  for  creations  and  receives  the  results  of  the  queries  from 
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each  backend.  The  REQP  process  receives  database  transactions  from  the  KC  in  the  TI 
process  and -sends  them  to  the  backends  through  the  CPUT  process.  The  PP  coordinates 
responses  from  the  backends  and  returns  the  results  of  a  user  query  to  the  KC  in  the  TI 
process. 

The  IIG  process  is  used  to  determine  which  backend  will  insert  a  particular 
record.  During  an  insert  operation,  the  IIG  process  tells  a  specific  backend—via  the  CPUT 
process—whether  to  start  a  new  track  or  put  the  record  being  inserted  into  an  existing  track. 
The  nG  process  uses  a  space  utilization  table  from  which  it  can  determine  for  each  cluster 
which  backend  contains  the  last  full  track  of  the  cluster,  and  can  determine  the  fust 
available  track  for  inserting  a  new  track  of  clustered  records. 
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2.  Backend  Processes 

The  six  processes  on  each  backend  are  backend  get  (BGET),  backend  put 
(BPUT),  concurrency  control  (CC),  data  disk  input/output  (DIO),  directory  management 
(DIRMAN),  and  record  processor  (RECP).  Each  oackend  communicates  with  the  front  end 
and  the  other  backends  via  BGET  and  BPUT.  The  CC  process  allows  concurrent  database 
access  by  multiple  requests.  The  DIO  process  deals  with  disk  addresses  of  data  and 
optimizes  disk  accesses.  The  DIRMAN  process  accesses  the  meta-disk  to  perform  look-ups 
in  the  descriptor  and  cluster  tables  to  produce  disk  addresses.  Finally,  the  RECP  process 
performs  data  retrieval,  storage  and  processing  required  on  whole  records  and  sends  its 
output  to  the  BPUT  process. 

D.  SUMMARY 

There  a  multitude  of  computer  systems  in  industry,  government,  and  education  today. 
With  the  growing  expense  of  maintaining  these  systems,  there  is  a  greater  interest  in 
communication  between  heterogeneous  databases.  The  Multibackend  Database 
Supercomputer,  design  offers  one  solution  by  storing  all  databases  using  one  language,  a 
kernel  data  language,  and  providing  multiple  model/language  interfaces  to  the  users  of  the 
databases.  An  added  benefit  of  MDBS  is  its  scalability  which  is  inherent  to  its  parallel 
backend  design. 
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ni.  THE  ORGANIZATION  OF  A  SYSTEM  GENERATION 

SOFTWARE 


A.  BACKGROUND 

Computers  are  being  used  to  perform  many  tasks  which  we  used  to  do.  The  examples 
are  endless,  from  word  processing  to  database  management,  from  computer  aided  design 
in  manufacturing  to  process  control,  and  from  monitoring  someone  on  a  life-support  system 
to  monitoring  the  skies  for  incoming  missiles  or  “hostile”  airplanes.  As  computers  continue 
to  become  more  powerful,  they  can  be  used  to  perform  even  more  impressive  tasks.  It  is 
important  to  note  that  when  you  compare  a  computer  to  a  human,  the  computer  will  be 
superior  in  several  categories.  Compared  to  human  beings,  computers  are  much  cheaper, 
they  can  perform  various  functions  quicker  and  more  accurately,  with  well  written 
programs  they  are  more  reliable,  they  are  free  from  boredom,  they  can  analyze  input 
quicker,  they  can  produce  output  quicker,  they  do  not  need  to  unlearn  a  previous  task  or  do 
they  require  training  for  a  new  task,  they  can  operate  in  hostile  environments  (hot,  cold, 
high  radiation,  no  atmosphere),  and  they  do  not  have  personal  difficulties  (family,  marital, 

vacations,  pay  &  benefits,  or  deaths  to  deal  with).^ 

The  thesis  is  not  about  to  suggest  that  we  should  or  even  could  completely  replace 
humans  with  computers.  Instead  we  should  make  computers  do  as  many  tasks  currently 
done  by  humans  as  we  can.  One  focus  of  the  System  Generation  Software  is  automating  the 
tasks  that  can  be  performed  by  a  computer.  The  benefit  of  automation  is  to  give  more  time 
to  the  operator  or  user  so  they  can  do  their  primary  task  of  testing  the  system  or  using  a 
database. 

A  project  administrator  is  responsible  for  dividing  up  the  work,  assigning  it  to 
individuals,  monitoring  progress  of  the  work,  and  then  ensuring  the  pieces  are  integrated 
into  a  working  product  An  administrator  should  not  have  to  manually  create  working 


*•  Lecture  notes  by  Richard  Hamming  for  the  course  EC4000,  “Future  of  Technology,”  at  the  Naval 
Postgraduate  School,  Monterey,  CA,  1993. 
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copies  of  code  for  program  developers  to  use.  Nor  should  an  administrator  be  required  to 
manually  keep  track  of  all  code  needed  for  a  particular  version. 

B.  OVERALL  DESIGN  REQUIREMENTS 

From  a  user’s  point  of  view  there  are  three  important  requirements  for  any  software. 
They  are  ease  of  use,  doing  what  you  expect  the  program  to  do,  and  reliability.  From  the 
point  of  view  of  a  systems  programmer  or  someone  who  must  maintain  the  code,  an  added 
requirement  is  well  written  code. 

Ease  of  use  is  the  important  requirement  for  any  software.  Given  a  choice,  if  you  have 
to  use  a  system  that  is  not  easy  (or  intuitive)  to  use  then  you  will  probably  avoid  that  system. 
If  a  system  is  hard  to  use,  then  you  spend  more  time  learning  how  to  use  that  system  thus 
less  time  getting  your  job  done.  This  is  true  for  any  job  whether  it  be  supervising,  data  entry, 
data  queries,  software  maintenance,  or  software  improvement.  With  a  system  that  is  easy 
to  use,  the  more  work  the  program  can  take  from  the  supervisor,  developer,  or  user,  then 
the  more  useful  work  they  can  perform. 

C.  FUNCTIONS  OF  A  SYSTEM  GENERATION  MANAGEMENT  TOOL^ 

Some  repetition  in  our  lives  is  inevitable.  Every  work  day  many  people  wake  up,  get 

dressed,  eat  breakfast,  drive  their  car  to  work,  and  so  on.  However,  when  we  have  to  repeat 
things  on  a  computer,  we  create  macros,  aliases,  or  executable  files  to  get  the  same  work 
done  with  less  human  effort.  There  is  no  reason  to  not  apply  this  concept  in  every  computer 
application.  In  the  world  of  computers,  there  are  many  examples  of  how  a  programmer  can 
make  using  the  computer  easier.  Macros,  aliases,  and  executable  files  are  just  a  start, 
graphic  interfaces  were  made  possible  after  the  invention  of  the  mouse.  The  absence  of  a 
command  line  on  Apple  Computer  products  saves  the  user  from  remembering  specific 
commands  and  syntax  which  are  a  burden  to  users  of  UNIX. 


Back  ups  were  not  listed  as  a  function  performed  by  system  generation  software  since  a  back  up 
process  or  procedure  should  already  be  performed  on  a  routine  basis. 
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There  is  no  reason  to  not  automate  as  much  of  the  software  development  and  testing 
process  as  possible.  A  system  generation  system  should  be  able  to  create  and  manage 
multiple  versions  of  software.  Managing  versions  includes  the  version  name,  who  is 
working  on  the  version,  who  is  authorized  to  modify  the  files,  where  the  code  is  located, 
and  how  to  compile  the  code.  I  am  not  going  to  discuss  personnel  management  or  security 
issues  in  the  thesis  so  I  will  limit  the  discussion  to  managing  the  versions.  When  multiple 
versions  exist  on  a  system  which  consists  of  multiple  machines,  there  is  the  added 
requirement  to  coordinate  running  the  same  version  code  on  each  machine. 

With  MDBS  there  are  two  things  to  manage.  The  first  one  is  the  controller-backend 
configuration.  The  second  one  is  controlling  the  version  of  the  software  being  used. 

The,  controller-backend  configuration  is  primarily  a  concern  during  system  start-up. 
Starting  MDBS  takes  about  13  commands  (approximately  450  characters)  for  the  controller 
and  about  6  commands  (approximately  450  characters)  for  each  backend.  As  with  most 
computers,  there  is  no  allowance  for  typographical  errors.  It  is  unrealistic  to  expect  anyone 
to  type  all  commands  each  time  he  runs  the  system  with  backends.  The  first  step  to  reduce 
the  number  of  keystrokes  needed  to  start  the  system  is  to  group  commands  into  shell 

scripts^.  However,  since  there  are  multiple  combinations  of  backends  to  use  for  a  particular 

run,  the  number  of  possible  shell  scripts  becomes  enormous^.  If  you  want  to  erase  an 
existing  database  firom  the  data  and  meta  disks  in  preparation  for  loading  a  different 
database  or  starting  over,  you  have  even  more  commands  to  type.  There  is  added  work  of 
navigation  through  directories  and  machines.  To  execute  a  command  on  each  backend  you 
need  to  rsh  or  specify  the  full  path  name  (e.g.  /@db3/usr/work/mdbs/bin/zero  devsl).  Other 
complications  include  the  fact  that  one  process  on  each  machine  cannot  be  started  until  the 
other  five  processes  on  that  machine  are  executing  properly.  And  one  process  on  each 
machine  expects  to  hear  from  other  processes  on  that  machine  within  a  specific  time  frame 

similar  to  .BAT  files  on  IBM  compatible  personal  computers. 

Using  one  controller  and  every  combination  of  1  to  8  backends,  there  are  255  possible  backend 
configurations. 
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or  it  will  terminate.  Developing  software  and  queries  for  the  system  often  results  in  frequent 
system  start-ups.  When  you  use  multiple  backends,  keeping  track  of  the  whole  system 
becomes  a  hinderance  to  anyone  who  concentrates  on  what  he  needs  to  do  rather  than  to  get 
the  system  ranning.  This  suggests  automation  to  start  the  MDBS  to  keep  track  of  the 
backend  configuration. 

Version  control  is  needed  while  improving  the  software  of  any  project.  When  a  portion 
of  the  system  needs  to  be  added,  changed  or  improved  then  a  decision  must  be  made  about 
what  will  be  modified.  There  should  be  one  copy  of  the  original  code  and  another  copy  of 
code  that  will  be  modified.  This  can  be  done  in  several  different  ways.  Three  methods  are 
presented  for  discussion:  (1)  make  a  back-up  and  modify  the  on-line  version,  (2)  copy  the 
entire  code  for  the  system  and  modify  the  copy,  and  (3)  copy  a  minimal  subset  of  the  entire 
code  and  modify  the  copy  (using  as  much  of  the  original  code  as  possible). 

For  discussion  purposes,  the  first  two  are  similar.  By  making  an  entire  copy  of  the 
system,  you  have  just  made  it  difficult  to  incorporate  the  change  back  into  the  original, 
unless  you  plan  on  abandoning  the  original  version  and  staying  with  the  new  version. 
Methods  (1)  and  (2)  make  sense  if  the  system  consists  of  only  one  segment  of  code  being 
modified  or  you  are  die  only  person  working  on  the  entire  system  and  can  keep  track  of  all 
changes.  Now  lets  say  that  there  is  more  than  one  segment  of  the  code  being  modified.  Will 
you  copy  the  entire  system  for  each  version  and  have  multiple  revised  versions  to  merge 
into  one  version  for  the  final  product?  This  method  of  providing  copies  of  the  entire  code 
for  each  version  causes  the  amount  of  code  on-line  to  be  directly  proportional  to  the  number 
of  versions  being  modified.  Some  would  argue  that  memory  is  cheap  so  we  should  not  be 
concerned  about  multiple  copies;  however,  more  important  than  memory  is  time.  The  more 

copies  of  code  one  has  to  integrate,  the  longer  it  takes  to  integrate^  as  a  result,  the  overall 
cost  is  higher. 


Something  we  discovered  the  hard  way  when  attempting  to  merge  the  code  firom  vwsion  7  with 
VeiE.6oftheMDBS. 
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Now  consider  the  third  approach.  Say  the  original  software  is  partitioned  in  some 
manner  such  as  a  user-interface,  common  code,  debugging  code,  system  maintenance,  and 
backend  code.  If  software  from  one  partition  is  being  worked  on  then  only  that  partition  of 
files  needs  to  be  copied  for  someone  (or  a  group  of  programmers)  to  modify.  To  test  the 
partition  of  the  code  under  development,  one  compiles  the  partition  of  code  being  modified 
using  any  include  files  and  object  code  needed  from  the  unmodified  partitions.  Then,  if 
needed,  link  the  original  code  with  any  newly  created  object  code.  Once  all  the  executable 
file(s)  affected  by  the  modifications  have  been  compiled,  then  the  system  is  ready  for 
execution.  By  copying  only  the  partition  of  code  that  is  being  modified,  you  will  still 
have  the  original  working  version  available,  second  you  will  not  use  as  much  disk  space  as 
used  by  the  second  method,  and  third  you  will  make  the  job  of  re-integrating  modified  code 
into  the  original  easier  since  you  only  have  one  partition  to  integrate  for  each  version. 

For  the  sake  of  discussion,  let  us  say  that  one  partition  being  modified  is  for  the 
backend  and  that  no  other  partition  depends  upon  it.  To  test  the  whole  system  you  compile 
the  process  just  modified.  Assuming  there  are  multiple  processes  in  the  original  system, 
start  the  system  using  the  executable  code  for  the  one  process  just  compiled  and  the 
executable  code  for  the  processes  from  the  unmodified  code. 

Ideally,  the  System  Generation  Software  can  be  designed  to  automatically  select  only 
the  portions  of  code  needed  for  development  and  not  replicate  unneeded  code  from  an 
original  version  to  a  working  version.  To  do  this  would  require  a  complete  dependency 
graph  of  all  files  in  the  system  and  some  means  of  knowing  what  files  to  copy  for  every 
possible  change  to  the  system.  This  approach  could  be  automated  with  considerable  effort. 
Two  questions  arise,  “Is  it  worth  the  effort?”  and  “It  there  an  easier  way?” 

An  example  from  the  MDBS  system  follows.  On  db3  was  a  working  copy  of  the 

multilingual  database  system^.  This  version  did  not  have  an  Object-Oriented  interface  and 
there  were  three  thesis  students  ready  to  implement  one.  The  VerE.6  was  protected  so  no 

in  directory  /usr/work/cs4322/VerE.6 
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one  would  modify  the  code  and  a  new  directory^,  was  created  with  the  same  sub-directories 
as  VerE.6.  After  determining  that  none  of  the  backend  processes  and  only  one  of  the  six 

o 

controller  processes  would  be  affected,  the  entire  code  for  that  process  was  recursively 
copied  to  the  directory,  /usr/work/mdbs/rich/CNTRLAl.  The  files  copied  included  source 
code  and  make  files  for  the  Language  Interface.  To  compile  the  system,  the  main  make  file 
was  modified  to  only  compile  the  files  needed  for  the  language  interface  process.  There 
were  eleven  processes  not  affected  by  the  modifications.  So  to  test  the  system,  the  shell 
script  to  run  the  system  was  modified  to  start  the  eleven  processes  from  the  previous 
version  and  the  one  process  compiled  from  the  modified  code. 

Looking  back,  there  was  not  need  to  copy  the  code  from  all  language  interfaces  such 
as  relational,  hierarchial,  and  network;  however,  the  more  time  would  have  been  needed  to 
further  modify  the  make  files  to  support  copying  only  a  portion  of  the  language  interface. 
This  raises  the  question  which  a  supervisor  should  keep  in  mind,  “What  is  the  optimal 
amount  of  code  to  copy  for  a  new  version?”  If  you  copy  the  complete  system  code,  then 
you  have  a  large  amount  of  code  to  integrate  back  into  the  original  system.  K  you  copy  the 
minimal  amount  then  you  may  end  up  using  more  time  setting  up  the  files  and  make  files 
than  you  could  save  by  not  having  to  integrate  the  modified  code  into  the  original  system. 
A  supervisor  must  exercise  “good”  judgement  in  making  this  decision.  As  advanced  as 
computers  are,  they  are  not  ready  to  make  such  decisions. 

There  are  ways  to  assist  a  supervisor  in  making  these  decisions.  Case  tools  and  some 
compiler  packages  generate  dependency  graphs  automatically  for  files.  The  dependency 
graph  for  files  is  not  always  visible  to  the  user  even  though  the  system  uses  that  information 
to  con^ile  the  files  for  a  system.  A  less  reliable  source  are  the  dependencies  which  exist  in 
a  makefile  either  constructed  manually  for  C  files  or  automatically  for  some  compilers. 


/usr/work/mdbs/rich, 
dblti.out,.  the  language  interface. 
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D.  FUNCTIONS  IMPLEMENTED  FOR  MDBS 


The  four  functions  performed  by  my  system  generation  software  management  tool  are 
start-up,  reconfiguration,  distribution  of  code,  and  version  creation.  All  the  commands 
needed  to  start  the  MDBS  are  performed  automatically  by  running  the  system  generation 
software  management  tool  program.  To  change  the  number  or  names  of  backends  in  a 
system  configuration,  the  system  generation  software  can  reconfigure  which  backends  will 
be  used  for  the  next  start  up  of  MDBS.  When  modifications  are  made  to  one  or  more 
backend  processes,  the  recompiled  executable  code  must  be  copied  or  distributed  to  each 
backend.  And  finally,  after  someone  copies  the  applicable  portions  of  code  and  modifies 
the  configuration  header  file  as  described  in  the  source  code  comments  (see  Appendix  E) 
and  modifies  the  makefiles  as  discussed  earlier  in  this  chapter,  one  can  create  a  new  version 
of  the  MDBS  software  on  the  controller  and  set  up  each  backend  for  the  execution  of 
processes  created  by  the  new  version. 

E.  THE  ORGANIZATION  OF  MY  SYSTEM  CODE 

To  automate  the  start-up  and  some  management  of  MDBS,  I  broke  up  the  start  up 
commands  and  made  them  generic.  Backend  commands  will  work  on  any  backend.  On 
each  backend  I  put  commands  common  to  every  version  for  the  backend  into  the  bin 
direaory.  Since  different  versions  are  kept  in  different  directories,  I  decided  to  include  the 
version  name  in  the  directory  name  on  both  tfie  controller  and  the  backend.  On  the  backend, 
the  directory  for  the  executable  code  must  have  the  name  be.<version_name>.  For 
example,  be.greg  contains  the  6  processes  and  the  run.be  command  which  calls  them.  To 
automate  creation  of  the  shell  script  which  calls  the  6  backend  processes,  I  made  a  template 
file  which  is  edited  by  the  system  generation  program  to  create  the  ran.be  file  for  a  new 
version.  Once  created,  this  file  is  copied  to  each  backend. 

The  system  generation  software  source  code  and  use  of  shell  scripts  were  designed  to 
follow  good  software  engineering  practices.  These  practices  include  thorough 
documentation,  modularity,  a  header  file  for  constants,  and  the  absence  of  global  variables. 
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Flow  diagrams  can  be  found  in  Appendix  D  for  the  overall  algorithm  and  three  of  the  most 
complex  functions.  A  listing  of  the  source  code  can  be  found  in  Appendix  E.  Sample  shell 
scripts  called  by  the  system  generation  software  are  listed  in  Appendix  G. 

F.  A  SUMMARY 

As  stated  in  the  design  requirements  earlier  in  this  chapter,  a  system  generation 
management  tool  should  be  easy  to  use,  do  what  it  is  expected  to  do,  be  reliable,  and  follow 
good  programming  principles.  The  software  I  have  developed  meets  these  requirements  as 
demonstrated  by  the  ease  of  use  by  first  time  users,  doing  what  the  user  expected  it  to  do, 
recovering  from  user  errors  often  with  specific  error  or  warning  messages,  and  setting  the 
example  for  good  programming  principles  in  its  code,  users’  manual,  and  documentation. 
Performance  and  ease  of  use  of  MDBS  have  improved  since  the  system  generation 
management  tool  has  been  installed. 


25 


IV.  USING  THE  SYSTEM  GENERATION  SOFTWARE 


The  system  generation  software  can  start  up,  reconfigure  backends,  distribute 
processes  when  needed,  and  assist  in  the  creation  of  new  versions.  A  step  by  step 
description  of  how  to  use  the  software  is  contained  in  Appendix  B. 

A.  TO  START  MDBS 

The  system  generation  software  has  simplified  the  start-up  process.  To  start  MDBS, 
all  you  type  is  begin.  The  system-generation  program  will  look  up  the  last  MDBS 
configuration  used  and  display  the  configuration.  After  answering  a  few  questions,  most  of 
which  only  require  ‘y’  or  ‘n’  for  a  response,  the  system  starts  up  in  the  configuration  you 
wanted. 

B.  TO  RECONFIGURE  THE  BACKENDS 

The  number  of  backends  can  depend  upon  the  size  of  a  database  you  plan  on  mnning 
or  the  testing  you  want  to  perform.  Changing  which  backends  to  include  in  the  next  run  of 
MDBS  is  fairly  simple.  After  typing  b^n  or  begin  -r,  the  system  generation  software  will 
display  on  the  screen  the  cuirent  version  name,  controller,  and  backends  in  the 
configuration.  To  change  the  configuration,  answer  ‘y’  to  the  question  “do  you  want  to 
reconfigure  the  Back  Ends?  (y/n)”.  For  each  available  backend,  you  will  answer  ‘y’ 
or  ‘n’  indicating  which  backends  you  want  in  the  new  configuration.  After  you  have 
responded  to  each  backend  and  are  satisfied  with  the  new  configuration,  it  is  saved  in  the 
configuration  file.  If  you  do  not  choose  any  backends,  then  system  generation  program  will 
reread  the  configuration  file  and  use  the  previous  configuration. 

C.  FOR  CONFIGURATION  MANAGEMENT 

After  modifying  code  in  MDBS  which  affects  any  backend  process,  you  need  to 
recompile  the  affected  code  and  copy  the  resulting  executable  files  to  each  backend.  Start 
the  system  generation  management  program  by  typing  ‘b^n  -d’.  The  program  will  ask  if 
you  need  to  recompile  the  backend  code.  If  you  need  to  recompile,  it  will  initiate  the 
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compiling^  and  then  copy  the  compiled  code  to  each  backend.  As  long  as  you  have  created 
the  applicable  directory  on  each  backend  (or  previously  used  the  system  generation 
software  to  create  them)  then  all  the  backend  executable  code  will  get  copied  to  the 
directory  on  each  backend  corresponding  to  the  current  version  name. 

D.  FOR  VERSION  CONTROL 

Creating  a  new  version  involves  a  fair  amount  of  planning.  Someone  will  copy  the 
applicable  portions  of  code  in  the  MDBS  hierarchy  into  new  directories  containing  the 
name  of  the  new  version  to  be  worked  on.  Then  he  modifies,  as  necessary,  the  make  files 
to  reflect  the  directory  structure.  The  system  generation  and  software  management  tool 
program  is  started  with  ‘begin  -v’  option.  System  generation  program  will  ask  for  the  new 
version  name.  After  you  enter  and  confirm  the  new  version  name,  the  program  will  change 
the  version  name  in  the  configuration  file.  My  system  generation  program  will  verify  that 
the  include  file  was  updated  to  reflect  the  current  version  name  as  the  default  version  name. 
The  the  system  generation  program  wUl  verify  that  the  new  version  name  can  be  found  in 
the  current  directory. 


*■  CfxnpUing  should  be  done  before  niniung  the  system  generation  software  management  tool  pro¬ 
gram  since  you  will  be  unable  to  verify  the  compilation  was  successful  before  the  program  copies 
code  to  the  backends.  If  compilation  was  unsuccessful,  you  will  have  to  exit  this  program  to  fix  any 
errors  then  recominle. 
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V.  AREAS  FOR  FURTHER  STUDY  WITH  THE  MDBS 


The  purpose  of  MDBS  is  to  demonstrate  to  academia,  business,  and  government  that 
building  a  database  computer  which  is  multilingual  and  scalable  (with  multibackends)  is 
not  only  possible,  but  a  practical  solution  to  the  problem  of  data  sharing  between 
heterogeneous  databases.  As  with  any  research  project,  there  are  many  ways  that  project 
can  be  improved.  In  this  chapter  I  have  listed  a  few  areas  of  improvement  and  directions 
for  further  research.  The  topics  are  listed  in  their  order  of  importance  as  I  see  them. 

A.  UPGRADING  EQUIPMENT 

As  time  passes,  computer  technology  improves.  The  current  MDBS  is  implemented 
on  Integrated  Solutions,  Inc.  Optimal  V24  model.  These  machines,  commonly  referred  to 
as  ISrV’s,  were  purchased  during  the  early  1980’s.  As  discussed  in  Chapter  I,  the  speed  of 
a  central  processor  has  increased  several  times  and  disk  drives  capacity  has  increased  at 
least  10  times  since  the  early  1980’s.  Advances  in  technology  and  time  have  a  cost 
Machines  become  out  of  date.  Machines  also  wear  out  and  the  hardware  requires  more 
frequent  repairs.  There  comes  a  time  when  the  repair  costs  exceed  replacement  costs.  The 
MDBS  will  be  getting  SUN  workstations  to  replace  the  ISIV’s.  Anyone  who  has  upgraded 
equipment  even  for  “upward  compatible”  computers,  knows  that  one  cannot  just  copy  the 
files  and  recompile.  Every  once  in  a  while  you  might  get  lucky  and  not  have  to  debug,  but 
more  often  than  not  there  will  be  differences  between  machines  which  will  require 
debugging  the  system. 

One  con^lication  with  multiple  hardware  configurations  is  how  to  handle  future 
upgrades  to  MDBS.  If  software  development  is  shifted  completely  to  dbll  as  controller 
with  dbl2  and  dbl3  as  backends,  how  will  any  improvements  be  transferred  to  the  two 

machines  on  loan?^  Any  future  changes  can  not  be  distributed  by  making  tape  backups  and 
sending  them  via  mail  to  be  loaded  on  dbl  and  db2.  Since  there  are  differences  between  the 

dbl  and  db2  at  Pt  Mugu  and  SUPSHIPS  San  Francisco. 
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new  and  old  hardwares,  the  changes  will  have  to  be  duplicated  and  tested  on  db3,  db4,  or 
db7  (with  the  same  ISFV  architecture),  then  copied  to  tape  before  sending  to  dbl  and  db2. 
This  process  has  a  fair  amount  of  overhead. 

B.  INTEGRATING  THE  TWO  MAIN  VERSIONS 

There  are  two  versions  of  the  Multilingual  Multimodel  Database  System  at  the  Naval 

Postgraduate  School  (NPS).  One  version^  uses  the  same  machine,  db3,  as  the  controller  and 
the  backend.  This  version  implements  the  five  language  interfaces  discussed  in  Chapter  II. 

The  second  version^  uses  one  machine  as  the  controller  and  one  or  more  machines  as 
backends.  The  second  version  works  reliably  for  the  attribute  based  data  model  and 
sporadically  for  the  other  data  models. 

The  Object-Oriented  Database  Model  has  been  added  to  the  version  on  db3.  Since  that 
version  supports  multilingual,  not  multibackend  parallel  database  operations,  the  Object- 
Oriented  code  should  be  integrated  into  the  multibackend  version  on  dbS.  The  other 
language  interfaces  must  also  be  examined.  The  multibackend  version  on  dbS  has  a 
significant  amount  of  multilingual  code.  Since  the  dbS  version  does  not  work  consistently 
with  the  languages  implemented,  the  multilingual  code  should  be  compared  with  the  db3 
multilingual  code  with  functions  consistently.  Improvements  to  the  dbS  code  should  be 
made  so  there  is  at  least  one  reliable  multilingual  and  multibackend  system. 

C.  IMPROVING  EXISTING  MODULES 

The  Daplex,  or  functional,  database  language  does  not  work  on  MDBS.  Much  e^ort 
was  put  into  this  project;  however,  the  task  has  not  been  completed.  Work  on  the  functional 
data  model  has  resumed.  The  functional  model  can  be  used  with  artificial  intelligence 
applications.  Having  a  functioning  functional  database  model  would  be  an  asset  to  the 
MDBS. 


VeiE.6  in  the  cs4322  account  on  db3. 
greg  in  the  mdbs  account  on  db8. 
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The  Object  Oriented  Data  Model  as  implemented  on  MDBS  resembles  the  Relational 
model  too  closely.  More  work  on  the  OODM  must  be  done  to  make  it  truly  object  oriented. 
It  currently  has  limited  inheritance  implemented.  Implementing  full  inheritance  and  the 
covering  property  are  under  development 

D.  THE  USER  INTERFACE 

Many  systems  exist  with  a  menu  driven  user  interface.  Menu  driven  displays  are 
adequate  for  some  purposes,  but  when  they  call  for  non  multiple  choice  answers,  the  user 
may  need  additional  information  displayed  on  the  screen,  possibly  in  a  different  window. 
Anyone  who  has  used  Windows  on  a  personal  computer,  or  X  Windows  on  a  workstation 
can  admit  that  their  productivity  increases  when  they  have  the  ability  to  access  references 
and  get  helpful  information  in  other  windows  while  they  are  doing  their  main  work  in 
another  window. 

Pop  up  windows  also  make  the  windows  environment  more  user  friendly  since 
information  can  be  seen  when  you  need  it  When  you  no  longer  need  the  information,  you 
can  remove  the  window,  or  the  system  can  be  programmed  to  remove  the  window. 

On  the  older  ISIV  equipment  there  is  a  primitive  windows  environment  available,  but 
the  MDBS  wftware  does  not  automatically  utilize  any  window  features.  The  user  has  to 
explicitly  open  windows  that  he  needs.  Having  multiple  windows  is  very  useful  when 
running  MDBS.  At  start  up  and  any  timt  during  system  operation,  a  user  can  find  out  if  the 
multiple  processes  are  running.  Multiple  windows  can  be  used  to  find  the  exact  names  for 
databases,  query  files,  ten^late  files,  and  other  files.  Even  these  uses  of  multiple  windows 
require  the  user  to  have  some  knowledge  of  the  system.  One  improvement  would  be  to 
display  the  valid  choices  in  a  window  instead  of  relying  on  the  user  to  remember  the  exact 
file  name,  database  name,  or  even  the  full  path  name  of  the  file.  Another  improvement 
would  be  to  have  a  separate  process  which  monitors  the  main  processes  and  lets  the  user 
know  all  is  well  and  what  died  if  all  is  not  well  with  the  multiple  system  processes.  This 
can  be  accomplished  by  X  Windows  on  workstations  using  the  UNIX  operating  system.  A 
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recent  graduate  from  the  Naval  Postgraduate  School  wrote  a  windows  environment 
interface  for  MDBS.  This  interface  was  written  using  transportable  applications 

environment  (TAE)^  Plus  and  could  be  used  as  a  template  to  start  such  an  interface. 

E.  IMPLEMENTATION  OF  MORE  CROSS-MODEL  ACCESS 

As  displayed  in  Figure  1,  there  are  only  two  cross-model  database  mappings 
implemented.  Load  using  Hierarchial  (DL/I),  Query  with  Relational  data  language  and 
Load  using  Object  Oriented  Data  Language,  Query  with  Relational  Data  Language. 
Studying  and  implementing  more  cross-model  accessing  can  make  MDBS  harder  to 
overlook  when  academia,  business,  and  government  attempt  to  solve  the  growing  problem 
of  heterogeneous  databases. 

F.  REDISTRIBUTION  OF  DATA  IN  AN  EXISTING  DATABASE 

A  commercial  system  with  multiple  backends  must  be  capable  of  error  recovery.  If 
one  backend  fails  then  the  system  should  be  able  to  be  reconfigured  automatically  or  with 
minimal  human  intervention.  One  database  application  is  a  bank.  A  database  for  a  bank 
cannot  be  interrupted  due  to  a  failure  of  one  backend  for  hours  or  days  while  the  data  (on 
the  backends)  is  reconstructed.  To  solve  this  type  of  problem  on  MDBS,  three  major  areas 
must  be  addressed,  duplication  of  base  data  on  backends,  and  redistribution  of  the  base 
data,  and  database  integrity. 

When  MDBS  data  is  stored  on  multiple  backends,  the  data  is  partitioned.  When  the 
number  of  backends  changes,  the  partitioning  also  changes.  Some  system  monitoring 
software  should  be  able  to  either  copy  the  back  up  data  for  the  backend  which  failed  and 

copy  it  to  a  backend  not  in  use^  or  to  back-up  (store)  all  existing  data  on  a  common  device 
(e.g.  magnetic  tape,  multiple  disk  drives)  and  then  reload  the  database  on  the  new 
configuration.  On  MDBS  this  could  be  done  by  storing  each  data  partition  on  at  least  two 

TAE  is  a  registered  trademark  of  the  National  Aeronautics  &  Space  Administration. 

To  keep  the  same  database  partitioning  which  would  maintained  by  havirc  the  same  number  of 
backends  in  the  new  configuration. 
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backends  and  having  one  or  more  backends  as  ready  spares  which  a  monitoring  system 
could  use  if  a  backend  in  use  fails.  Database  integrity  is  discussed  in  (Quantock,  1989). 

G.  UPGRADING  EXISTING  CODE  FROM  C  TO  C++ 

Some  people  say  that  the  writing  is  on  the  wall  for  C  code.  C++  is  an  Object-oriented 
language  and  can  do  much  more  than  C  could.  Object-oriented  databases  should  be 
implemented  as  part  of  a  multilingual,  multimodel  database  system.  Future  code  for  the 
multilingual,  multimodel  database  system  should  be  written  in  a  higher  level  language  than 
C,  namely  C++. 

H.  MULTIPLE  USER  AND  MULTIPLE  DATABASE 

MDBS  is  a  research  database  computer  and  does  not  have  everything  that  is  expected 
of  a  commercial  database.  An  area  of  development  not  being  worked  on  is  allowing  MDBS 
to  have  multiple  users  and  multiple  databases  loaded  simultaneously.  A  commercial 
database  system  must  be  able  to  handle  multiple  users  simultaneously  and  should  be  able 
to  handle  multiple  databases  concurrently. 
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VI.  PROBLEMS 


MDBS  is  not  a  commercial  product  It  has  been  worked  on  mostiy  by  students  earning 
their  masters  degree  in  computer  science  at  the  Naval  Postgraduate  School.  Since  MDBS 
is  not  meant  to  be  commercial  strength,  there  is  less  emphasis  placed  on  the  optimal  user 
interface,  sufficient  source  code  comments,  or  on  error  recovery  and  more  upon  showing 
that  something  new  can  be  done  with  the  system.  Each  language  interface  implements  a 
subset  of  the  commands  for  that  language.  The  first  language  interface  mapping  was  from 
relational  (SQL)  to  the  attribute  based  data  language  and  model.  This  was  the  best  written 
and  documented  schema  transformation  and  query  translation.  The  subsequent  language 
interfaces  were  based  on  the  design  used  by  the  relational  model  and  language.  Although 
there  have  been  several  paid  systems  programmers  to  get  the  multibackend  system  running 
and  occasionally  to  make  it  better,  there  are  currently  no  paid  staff  responsible  for  MDBS. 
Unfortunately,  the  system  stops  unexpectedly.  Some  reasons  for  the  system  to  stop  are  data 

entry  error^  a  software  error,  and  a  hardware  failure. 

A.  RETRIEVE  -  COMMON 

The  kernel  data  language  uses  the  retrieve  common  command  to  perform  a  join  on 
attribute-value  pairs  (similar  to  equi-join  in  the  relational  model).  During  the  retrieve 
common  process,  back-ends  must  communicate  with  each  other.  The  inter  process  (IP) 
buffer  has  a  size  of  1000  characters.  Messages  sent  between  processes  can  have  up  to  1425 
characters.  The  operating  system  uses  a  flushing  mechanism  for  messages  longer  than  1000 
characters.  Some  indications  are  that  this  mechanism  is  not  given  enough  time  to  complete 
the  flushing  task  before  the  arrival  of  the  rest  of  the  message  or  the  arrival  of  the  next 
message.  As  a  result,  the  messages  are  damaged.  The  damaged  messages  causes  errors 
when  calculating  virtual  addresses  for  the  data  (Hammond,  1992,  p.42).  It  was  proposed 


Made  by  the  database  r  /:  ji  a  corresponding  lack  of  error  recovery  built  into  the  source  code 
by  a  programmer. 
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that  an  upgrade  of  the  operating  system^  might  solve  these  problems.  This  remains  to  be 
seen  as  the  system  is  being  ported  over  to  the  more  powerful  workstations. 

A  second  problem  with  retrieve  common  is  a  result  of  backends  having  to 
communicate  amongst  one  another.  For  commands  other  than  retrieve  common,  backends 
only  communicate  with  the  front  end.  The  front  end  will  either  broadcast  a  message  to  one 
or  to  all  backends.  In  response  each  backend  win  send  one  or  more  messages  to  the  front 
end.  If  there  are  several  retrieve  commons  being  performed,  or  if  one  retrieve  common  has 
a  large  set  of  data  to  join,  the  amount  of  inter-backend  communications  increases.  This 
increased  traffic  on  the  ethemet  causes  the  number  of  collisions  to  increase.  As  a  result, 
inter-machine  communications  are  slowed  down.  This  delay  is  severe  enough  that  during 
the  bench  marking,  the  retrieve  common  was  not  included  in  the  commands  tested  (Hsiao, 
1991,  p.  54).  This  communications  related  problem  may  be  solved  by  using  a  protocol 
similar  to  the  token  ring  protocol.  Using  a  fiber  distributed  data  interface  (FDDI)  would 
solve  the  communications  bottleneck  two  ways.  First,  the  data  rate  is  much  higher  than  the 

ethemet  (10  vs.  100  Mega  bits  per  second)^.  Second,  the  protocol  does  not  permit 
collisions. 

B.  SOFTWARE  RELIABILITY 

Since  there  is  not  error  recovery,  the  system  locks  up  at  unexpected  times.  You  can  be 
using  a  “perfectly”  coded  language  interface  and  the  system  can  still  crash  or  lock  up.  There 
are  many  possible  problems.  One  problem  could  be  due  to  the  design  of  the  Ethemet 
(broadcasting  messages  can  collide  and  not  be  detected).  If  a  message  gets  lost,  the 
controller  can  wait  indefinitely  expecting  a  message  which  will  never  be  received.  If  the 
backend  sent  the  message,  then  it  will  not  send  it  again.  This  problem  could  also  be  due  to 

^  The  ISIV’s  were  installed  with  ISI  4.2  B^eiey  Software  Distribution  (BSD)  Release  3.07.  The 
operating  system  was  upgraded  to  4.3  BSD  Yirtual  VAX  11  Version  490145  Rev  B  dated  October 
1987.  When  the  movement  to  Sun  Wwlcstations  is  completed,  MDBS  will  be  using  UNDC  Sun  OS 
4.1.1. 

If  messages  can  not  be  sent  to  multiple  recipients  then  a  network  of  10  backends  would  only  ben¬ 
efit  from  the  lack  of  collisions,  not  from  the  higher  data  rate. 
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how  sockets  are  handled  by  the  operating  system  (BSD  4.3)  as  discusses  in  the  previous 
section. 

MDBS  is  meant  to  show  that  a  multimodel,  multilingual,  multibackend  database 
system  can  be  implemented.  Once  government  and  industry  realize  the  strengths  of  both 
the  multilingual  and  multibackend,  then  the  idea  could  spread  to  a  commercial  strength 
product  which  would  better  handle  data  entry  errors  and  remove  most— if  not  all— software 
errors. 


35 


vn.  CONCLUSIONS 


As  more  demands  are  placed  on  computing  resources,  it  is  vital  that  database  systems 
be  designed  which  get  the  most  out  of  each  computing  dollar.  Heterogeneous  databases 
exist  in  most  large  organizations.  Their  existence  causes  a  lack  of  effective  data  sharing, 
duplication  of  data,  and  wasted  computing  resources.  One  way  to  solve  these  problems  is 
to  use  a  kernel  data  model  and  language  to  store  all  databases.  Users  can  view,  load,  and 
modify  a  database  in  whatever  data  model  and  language  they  would  like  to  use.  The 
Multilingual,  Multimodel,  Multibackend  Database  Supercomputer  located  at  the  Naval 
Postgraduate  School  has  demonstrated  the  feasibility  of  such  a  system. 

A.  WHAT  I  HAVE  ACCOMPLISHED 

Running  multi-process  systems  is  hard  enough  on  multiple  machines,  but  adding 
multiple  versions  on  top  of  this  becomes  a  difficult  task.  The  more  we  can  automate,  the 
less  a  supervisor,  software  developer,  or  user  will  have  to  know  or  do  to  ran  the  system. 
There  is  a  definite  need  for  automation  wh^e  ever  possible.  Human  beings  do  not  need  to 
waste  time  remembering  exact  command  names  and  machine  specific  peculiarities. 
Innovative  ideas  in  the  computer  industry  have  continued  to  increase  productivity  of 
computers  and  users.  Other  ideas  which  have  improved  the  user  interface  and  thus 
productivity  are  icons,  pull  down  menu’s,  and  a  consistent  use  of  color.  The  lack  of  a 
command  line,  although  intimidating  to  more  experienced  command  line  users,  shifts  the 
burden  of  remembering  correct  commands  firom  the  user  to  writing  a  better  interface  to  the 
programmer  of  that  interface. 

After  using  the  MDBS  I  realized  its  merits  and  its  possibilities  for  solutions  to  real 
world  problems.  I  also  realized  that  the  learning  curve  to  use  the  system  is  long  and  steep. 
One  of  my  goals  is  to  make  the  MDBS  easier  to  leam  and  easier  to  use.  With  the  code  I 
have  written  (see  Appendix  E),  I  have  automated  running  MDBS.  Getting  the  system  to  run 
does  not  solve  all  the  problems.  Knowing  how  the  system  operates  is  also  important  in 
easing  the  learning  curve. 
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Documentation  is  very  important  and  cannot  be  stressed  enough.  The  lack  of 
comments  today  in  a  program  that  you  know  inside  and  out  will  become  a  huge  barrier  for 
anyone— including  yourself— from  looking  at  that  code  in  the  future  and  being  able  to 
quickly  understand  not  only  what  it  does,  but  how  the  program  does  it!  I  have  thoroughly 
documented  my  code  and  have  gathered  information  to  explain  the  overall  setup  of  the  file 
system  which  supports  MDBS.  Within  that  I  have  described  many  of  the  temporary  and 
permanent  files  that  make  MDBS  operate.  Together  with  the  User’s  Manual  (Appendix  A 
to  Bourgeois  1993),  which  explained  how  to  use  the  multilingual  aspect  of  MDBS,  it  is  now 
much  easier  for  someone  new  to  learn  the  MDBS  from  a  database  user‘s  perspective  and 
from  a  systems  programmers  viewpoint. 

B.  SOME  FUTURE  AREAS  OF  RESEARCH  RELATED  TO  THIS  WORK 

Areas  for  possible  future  research  are  discussed  in  the  previous  chapter.  The  areas 
which  need  to  be  worked  on  are  the  upgrading  of  equipment,  improving  the  reliability  of 
the  software,  object  oriented  interface,  and  more  cross-model  accessing  capability. 
Upgrading  of  equipment  to  Sun4  workstations  is  in  progress  which  may  improve  the 
reliability  of  existing  software.  A  full-time  staff  or  contractor  needs  to  work  on,  test,  and 

improve  the  performance  of  the  multibackend  version  of  MDBS^  to  at  least  the 
performance  of  the  single  backend  version  of  MDBS^. 

The  object  oriented  interface  needs  to  be  less  relational  and  more  object-oriented.  The 
interface  needs  to  be  integrated  into  the  multibackend  version  of  MDBS  which  will  be  done 
once  the  system  is  ported  to  the  Sun4  workstations. 

And  finally  the  cross-model  accessing  capability.  I  believe  this  is  the  greatest  strength 
of  MDBS.  If  academia,  business,  and  government  would  see  the  benefits  of  a  Federated 
Database  System  using  a  kernel  data  model  and  language,  there  could  be  great  steps  toward 
data  sharing  and  the  eventual  reduction  in  the  number  and  sizes  of  databases  in  the  world 

’•  The  current  version  name  is  ‘greg’  which  is  an  improvement  on  version  ‘7’. 

Implemented  on  db3  as  ‘VerE.6’. 
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today  given  a  constant  amount  of  information  to  store.  This  reduction  would  result  in  a 
tremendous  savings  in  the  cost  of  maintaining  existing  databases. 
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APPENDIX  A.  FILE  ORGANIZATION  OF  MDBS 


This  appendix  presents  an  overview  of  the  directory  structure  and  important  files  used 
to  operate  MDBS.  First  we  review  the  basic  hierarchy  of  the  DATABASE  software  files 
starting  at  the  Version’s  Directory  which  is  maintained  on  the  controller  (currently  ISFVS). 


/u/mdbs/ 


BE/  COMMON/  CNTRL/  bin/  run/ 


BCOM/ 

ccy 

COMMON/ 

DIO/ 

DM/ 

RECP/ 

I 

Object/ 


TI/  CCOM/ 

COMMON/ 

no/ 

PP/ 

Object/  LangIF/  REQP/ 


Object/ 

include/  lib/  src/ 


Figure  7.  Directory  Structure  of  the  MDBS  files  on  the  controller 


A.  CONTROLLER 


BE:  Directory  containing: 

the  following  6  sub-directories  which  create  the  6  Back  End  Processes:^ 
COMMON:  source  files  common  to  all  Back  End  processes. 

CC:  Concurrency  Control. 

DIO:  Data  Disk  Input/Ou4>ut 

DM:  Directory  (meta  disk)  Management  -  where  records  are  put 
RECP:  Record  Processing. 

BCOM:  Back  End  Communications  -  Machine  to  machine, 
the  executable  code  for  each  process  (bget,  bput,  cc,  dio,  dirman,  recp).exe 


CNTRL:  Directory  containing: 

the  following  6  sub-directories  which  create  the  6  Controller  Processes:^ 

COMMON:  source  files  common  to  all  Controller  processes. 

CCOM:  Control  Communications  -  cget  and  cput  processes. 

RECP:  Request  Preparation  -  ABDL  Parser 
PP:  Post  Processing  - 

DO:  Insert  Information  Generation  -  determine  in  what  track  to  put  a  record 
TI:  Test  (user)  Interface  -  source  and  libraries  for  each  non  kernel  Data 
Manipulation  Language  except  ABDL 

the  executable  code  for  each  process  (cget,  cput,  iig,  pp,  reqp,  ti).exe^ 


Each  has  its  own  Object/  directory  used  during  die  compilation  process  for  include  and  object 
files. 

The  .exe  extension  is  used  to  diffeientiatB  this  vosion  firom  the  cs4322  version  which  uses  the 
.oat  extension. 

See  footnote  1. 

See  footnote  2. 
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COMMON:  Directory  containing: 

source  common  to  Back  End  and  Controller 

bin:  Directory  containing: 

shell  scripts,  binaries,  and  makefile  used  to  copy  or  move  files  and  data,  view  data  and 
meta  data,  and  compile  any  and  all  parts  of  the  system, 
constants  -  display  meta  table  sizes  (source  in  Version_Name/BE/DM) 
cpcount  -  copy  a  file  for  a  given  number  of  bytes.  Use  to  copy  meta  and  data  disks  to 
other  BE  (source  in  Version_Name/BE/DIO) 
cpydisks  -  script  using  cpcount,  NOT  to  be  used  on  CONTROLLER 
cpytobe  -  copy  BE  processes  to  backend  (not  used) 
disp  -  display  dynamic  meta  table  (source  in  Version_Name/BE/DM) 
gdb  -  generate  lest  database  of  arbitrary  size  from  template  file  (usage:  gdb  D.t 
D[.r])(source  in  Version_Name/CNTRL/n) 
makeall  -  shell  script  which  calls  make  in  the  appropriate  directory  for  each  of  the  12 
processes 

makefile  -  can  be  called  with  4  options,  make  aU  -  calls  makeall;  make  auto  -  compiles 
main.c;  make  clean  -  deletes  all  object  files;  make  dist  -  creates  a  tar’d, 
compressed  backup  file  of  the  distribution, 
rectag  -  print  out  formatted  data  from  data  disk  (source  in  Version_Name/BE/DIO) 
zero  -  writes  O’s  to  a  file;  used  to  erase  meta  and  data  disks  before  loading  a  new 
database  (source  in  Version_Name/BWDIO) 

confrgure.h  -  include  file  for  main.c 

main.c  -  source  for  automatic  reconfiguration,  redistribution,  and  execution  of  MDBS 
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run:  Directory  for: 

files  (1)  created  by  the  controller  during  execution  of  MDBS  storing  information 
about  the  database  and  (2)  needed  to  start  and  restart  MDBS. 

.Syntax  - 
.TransFile  - 

.config.db  -  version  name  and  list  of  backend  names  in  current  configuration.  Used  by 
main  (specifically  from  Version_Name/bin/main.c  program). 

.cuiT_fiile  -  temporary  file  used  by  some  language  interfaces 

.exe.awk  -  pattern  matching  for  awk  command  in  stop.<controller>  (e.g.  stop.dbS) 

.output  -  last  query  result  displayed  to  the  screen. 

.qry_file  -  the  last  displayed  create/query  (SQL)  or  dbd/request  (DL/I)  file, 
trace/  -  directory  for  trace  files  from  the  5  controller  processes 

<process  name>.tr  -  trace  for  most  recent  execution  of  that  process 
database  list  files:  •  (.dap.dbl,  .hie.dbl,  .netdbl,  .obj.dbl,  .sql.dbl)  one  file  for  each 
language,  one  database  name  per  line.  These  files  are  created  each  time  MDBS  is  run.  Each 
file  is  created  from  its  respective  .*dbs.dat  files. 

database  data:  -  (.dapdbs.dat,  .hiedbs.dat,  .netdbs.dat,  .reldbs.dat)  one  file  for  each 
language,  a  listing  of  all  information  needed  to  create  a  database  catalog  (.*.cat)  file  for 
each  known  database.  These  files  are  updated  when  a  database  schema  is  added  via  creation 
or  translation/transformation  from  another  schema. 

database  catalogs:  -  (.DTH.cat,  .DTH2cat,  etc.)  one  file  for  each  database  listed  in 
each  .*.dbl  file.  Information  is  created  at  run  time  (probably  when  the  specified  language 
interface  is  requested;  could  be  at  start  of  ti.c)  by  reading  the  corresponding  .dat  file, 
shell  scrips:  all  called  by  Version_Name/bin/main.c 
startcntrl  -  start  the  first  5  of  6  controller  processes 
master.run.be  -a  template  used  to  create  run.be  for  a  new  version  of  software 
run.be  -  latestrun.be  created  from  master.run.be 

stop.<machine_name>  -  stop  processes  remaining  from  last  run  of  database 
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zero.<back_end_name>  -  erase  data  from  back  end  meta  and  data  disks 
zero.<controller_name>  -  erase  files  storing  data  of  previous  database 

/GNTRL/n/LanglF  -  Test  Interface,  Language  Interface 

include:  files  included  in  .c  files  at  compile  time  (e.g.  .del,  .ext,  .h,  .def) 
lib:  archived  files.  One  per  Data  Language  (sql.a,  dli.a,  dml.a,  dap.a  obj.a). 
sre:  source  code  for  each  data  manipulation  language  (DML).  Sub  directories  are: 
Com:  three  files  used  by  each  data  language. 

Sql:  Relational  Data  Language  (SQL)  - 
Dli:  Hierarchial  Data  Language  - 
Dml:  Network  Data  Language  - 
Dap:  Daplex  Data  Language  - 
Obj:  Object  Oriented  Data  Language  - 
Each  Data  Language  has  up  to  5  sub  directories: 

Alloc:  Allocation  of  memory  for  various  data  structures. 

Kc:  Kernel  control  -  communicates  between  ABDL,  Kms,  and  Kfs. 

Kfs:  Kernel  formatting  system  -  collects  and  formats  data  from  Kc  for 
display  in  Lil 

Kms:  Kernel  mapping  system  -  parse  queries  generated  by  Lil.  Sends 
to  Kc. 

Lil:  Language  Interface  Level  -  menus  for  user.  Communicates  with 
Kms,  Kc,  and  Kfs. 
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The  following  directories  and  files  must  be  present  on  the  CONTROLLER  within 
$HOME  or  the  login  directory: 

Version_Name/  described  above.  As  a  minimum,  you  need  the  6  executable  files  for 
the  controller  processes  and  all  of  the  run  directory  to  start  the  system. 

UserFiles/  formats  of  files  are  described  in  the  User’s  Manual,  chapter  DC  (Bourgeois, 
1993,  pp.  84-88). 

DB_NAME.d  -  descriptor  file.  Places  restrictions  on  values  and  ranges  of 
values  of  attributes.  Used  in  clustering. 

DB_NAME.r  -  mass  load  file.  For  Relational  and  Attribute  Based  Models. 
DB_NAME.t  -  template  file.  Structures  of  the  attribute-based  database. 
DB_NAME#n  -  ABDL  query  list  Also  known  as  Traffic  Unit 
DB_NAME<sql,  dml,  dli,  obj,  dap>db  -  schema  file  -  needed  by  system  to 
create  a  new  database  unless  you  do  it  interactively.  This  file  contains 
enough  information  to  generate  the  descriptor  and  template  files 
OTHER  -  files  such  as  DB_NAMEsqlreq  create  table  and  SQL  select 
Sockets/  1 1  connections  between  Backend  and  Controller 
CC= 

DM= 

nG= 

P_P(XB= 

RECP= 

TI= 

DIO= 

G_PCLC= 

PP= 

P_PCLC= 

REQP= 
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.<each  control  executable  in  CNTRL/>.pid  -  (not  used)  process  ID  of  last  run  of  that 
process 

.alias  -  aliases  to  facilitate  navigation  in  the  directory  structure  as  well  as  the  b^n 
MDBS  alias  which  must  be  updated  if  the  current  version  is  moved  or  renamed.  At  the  time 
of  this  writing,  begin  is  defined  as:  alias  begin  ‘cd  /u/nxlbs/greg/run;  main’ 

.cshrc 

.login 
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B.  BACK  END 


The  second  part  of  the  file  sructure  of  Multibackend  Database  Supercomputer 
(MDBS)  to  be  described  is  the  back  end  file  structure.  Again  starting  with  the  directory 
structure  of  each  back  end,  see  Figure  8. 


/usr/work/mdbs 


Figure  8.  Directory  Structure  on  Backend  Machines 

EACH  Back  end  has  the  following  files  and  directories: 

/usr/work/mdbs  -  login  directory  containing: 

.(bget,  bput,  cc,  dio,  dirman,  recp).pid  -  process  IDs  for  last  run  (not  used) 
be.  Version/  -  one  directory  per  version 

-  6  process  executable  file  names  (bget,  bput,  cc,  dio,  dirman,  recp).exe 
6  process_name.tr  files  for  stdout  and  stderr  while  running 
run.be* 

UserFiles/  -  copies  or  links  to  files  with  the  same  name  as  in  UserFiles/  on  the 
controller.  Both  files  must  exist  on  every  back  end  for  every  database  used  on  the  system. 
DB_NAME.r 
DB_NAME.t 

bin/  -  copies  of  files  or  Fnks  to  files  in  Version_Name/bin  may  be  included.  The 
following  must  exist  for  the  automatic  reconfiguration  and  start-up  program  to  work 
stop.cmd  -  from  Version_Name/bin 
exe.awk  -  from  Version_Name/run 
zero  -  from  Version_Name/bin 
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Sockets/  -  6  connections  between  Backend  and  Controller 


CC= 

010= 

DM= 

G_PCLB= 

P_PCLB= 

RECP= 
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APPENDIX  B.  HOW  TO  START  AND  RECONFIGURE  MDBS 


The  following  is  a  description  of  what  you  need  to  do  to  start  and  run  the  Multi¬ 
backend  Database  Supercomputer  (MDBS)  as  automated  by  my  program. 

Initial  Conditions: 

Login  as  mdbs  to  the  controller,  currently  dbS,  open  one  window  or  use  the  console 
for  the  controller  to  run  MDBS.  To  monitor  the  processes  and  check  the  meta  and  data 
disks,  you  should  rxterm  or  create  another  window  again  to  the  controller  and  once  for 
each  of  the  Back  Ends  you  plan  to  use.  This  can  be  done  by  being  logged  in  to  each  of 
the  applicable  consoles  and/or  have  multiple  windows  open  on  your  terminal,  one  for 
each  back  end  and  two  for  the  controller. 

A.  TO  START  MDBS 


At  the  UNIX  prompt  on  the  controller  type: 
start  or  begin 

When  you  do  this  you  are  changed  to  directory  greg/run  and  the  program  configure 
begins  execution.  K  the  start  or  begin  aliases  have  not  been  updated  at  the  last  version 
change,  then  an  error  message  will  be  displayed  stating  that  the  version  name 
“<name>”  from  .config.db  is  not  in  current  path.  If  this  message  occurs,  either  change 
the  alias  or  manually  cd  <Version_name>/run;  main. 

The  following  will  be  displayed  once  the  configure  program  starts  execution 


Welcome  to  MDBS,  today  is  Fri  Feb  19  12:54:05  PST  1993 


Check  the  time  each  file  was  last  compiled: 


-rwxrwxr-x 

1 

mdbs 

83649 

Nov 

5 

12:58 

. . /BE/bget .exe 

-rwxrwxr-x 

1 

mdbs 

83649 

Nov 

5 

12:58 

. . /BE/bput .exe 

-rwxrwxr-x 

1 

mdbs 

246357 

Nov 

5 

12:57 

. . /BE/cc .exe 

-rwxrwxr-x 

1 

mdbs 

89208 

Nov 

5 

12:57 

. ./BE/dio.exe 

-rwxrwxr-x 

1 

mdbs 

375581 

Nov 

5 

12:58 

. . /BE/dirman . exe 

-rwxrwxr-x 

1 

mdbs 

335559 

Nov 

5 

12:59 

.  /BE /recp.exe 

-rwxrwxr-x 

1 

mdbs 

91811 

Nov 

5 

12:52 

. . /CNTRL/cget . exe 

-rwxrwxr-x 

1 

mdbs 

91811 

Nov 

5 

12:52 

. . /CNTRL/cput . exe 

-rwxrwxr-x 

1 

mdbs 

56680 

Nov 

5 

12:52 

. ./CNTRL/iig.exe 

-rwxrwxr-x 

1 

mdbs 

56971 

Nov 

5 

12:53 

. . /CNTRL/pp . exe 

-rwxrwxr-x 

1 

mdbs 

84382 

Nov 

5 

12:53 

. . /CNTRL/reqp . exe 

-rwxrwxr-x 

1 

mdbs 

488244 

Nov 

5 

12:56 

. ./CNTRL/ti.exe 

There  should  be  12 

files  listed 

,  if 

not 

0 

c 

need  to  recompile 

Do  you  need  to  recompile  any  executable  and/or  copy  the  6 
executable  files  to  each  Back  End 
(bget,  bput,  cc,  dio,  dirman,  recp.exe)?  (y/n) 
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Most  of  the  time  you  will  enter  n  to  this  question. 

Next  you  will  be  shown  the  name  of  the  current  version,  which  machine  is  the  controller 
and  which  are  the  back  ends. 

The  Current  Configuration  is: 

Version  Name:  greg 

Controller:  dbS 

3  Back  Ends : 

db3 

db4 

db7 

WARNING:  All  data  will  be  lost  if  you  reconfigure 
Do  you  wish  to  reconfigure  the  Back  Ends?  (y/n) 


Normally  you  would  enter  n  so  that  you  will  use  the  same  back  end  configuration. 
Reconfiguring  is  discussed  in  the  next  section. 

If  you  do  not  reconfigure,  then  you  will  be  asked  if  you  want  to  use  the  last  data  base 
loaded  on  the  system.  Unless  you  know  which  database  was  used  last  and  you  want  to 
use  that  database,  enter  n.  If  you  answer  y,  all  data  and  meta  data  will  be  erased. 

Do  you  wish  to  use  current  database?  (y/n) 

If  you  enter  n,  the  following  will  be  displayed  for  each  back  end: 

Zeroing  backend  meta  disk  on  back  end,  db4 . . . 

File  to  'zero  =  /dev/sdlc 
File  size  =  105638400 
Bytes  to  zero  =  1000000 
Bytes  written... 

819200 

1000000 

Zeroing  backend  data  disk  on  back  end,  db4 . . . 

File  to  zero  =  /dev/smOc 
File  size  =  418775040 
Bytes  to  zero  =  100000 
Bytes  written... 

100000 

Then  for  the  controller,  the  following  will  appear  with  a  possible  “No  match”  or  two. 

Removing  CINBT  and  IIG  AT  tables  on  controller,  db8 . . . 

Removing  sql,  hie,  and  catalog  files 


Now  that  the  configuration  is  determined,  you  are  given  the  opportunity  to  not  run 
MDBS.  This  option  is  here  in  case  someone  wants  to  reconfigure,  stop  unwanted  jobs, 
recompile  code,  or  copy  code  to  back  ends  without  actually  running  MDBS. 
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Do  you  wish  to  run  the  Multi  Model,  Multi  Lingual, 

Multi  Backended  Database  System?  (y/n) 


Entering  n  will  exit  you  from  this  program  and  return  you  to  the  UNIX  shell.  Entering 
y  will  cause  the  following  steps  to  occur: 

Configure  will  check  the  file  .config.db  to  see  the  last  combination  of  backends.  Any 
process  running  on  the  controller  or  backend  will  be  terminated  (with  the  shell  scripts 
named:  stop.<machine_name>).  For  the  configuration  in  this  example  and  no  duplicate 
processes  still  ranning,  the  following  will  be  displayed. 


stopping  processes  on  back  end  db3 
no  processes  to  kill  on  db3 
stopping  processes  on  back  end  db4 
no  processes  to  kill  on  db4 
stopping  processes  on  back  end  db7 
no  processes  to  kill  on  db7 
stopping  processes  on  dbS,  the  controller 
no  processes  to  kill  on  dbS 


Then  5  of  the  6  processes  on  the  controller  are  started  in  the  background. 


starting  5  of  6  controller  processes  on  dbS... 

[1]  2510 

[2]  2511 

[3]  2512 

[4]  2513 

[5]  2514- 

For  each  back  end,  six  processes  are  started. 


Running  backend  on  db3 . . . 

[1]  1358 

[2]  1359 

[3]  1360 

[4]  1361 

[5]  1362 

[6]  1363 

Running  backend  on  db4 . . . 

[1]  2302 

[2]  2303 

[3]  2304 

[4]  2305 

[5]  2306 

[6]  2307 

Running  backend  on  db7 . . . 

[1]  808 

[2]  809 

[3]  810 

[4]  811 

[5]  812 

[6]  813 


50 


At  this  point,  the  user  interface  begins. 

Note  that  the  6th  process  on  the  controller,  ti.exe,  does  not  start  right  away.  It  must 
communicate  with  the  other  5  processes  and  usually  dies  if  any  of  the  other  processes 
are  not  ready  to  communicate  via  the  Sockets. 

You  can  verify  using  each  window  (or  terminal  as  applicable,  described  above  in  the 
initial  conditions)  that  all  6  processes  are  running  on  each  back  end. 

For  each  Back  End  type; 

When  you  are  logged  in  (or  rlogin)  to  that  machine:  ps  or  ps  |  grep  exe 
When  you  have  a  window  on  another  machine  then;  rsh  <Back  End  Name>  ps 


To  verify  the  6  processes  are  running  on  the  controller  type: 
ps  or  ps  I  grep  exe 

If  any  process  is  not  running  on  any  machine  in  the  current  configuration,  then  you  must 
exit  Ae  main  program  on  the  controller  by  typing  ‘x’  at  the  first  prompt.  Otherwise, 
continue  to  run  MDBS  in  accordance  with  Ae  User’s  Manual  which  can  be  found  in 
(Bourgeois,  1993). 
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B.  TO  RECONFIGURE  MDBS 


Initial  Conditions  are  described  at  the  beginning  of  this  Appendix. 

At  the  UNIX  prompt  on  the  controller  type: 
start  -c  or  begin  -c 

Follow  the  instructions  for  To  START  MDBS.  If  you  forget  the  -c  then  answer  y  to  the 
following  WARNING  and  question. 


WARNING:  All  data  will  be  lost  if  you  reconfigure 
Do  you  wish  to  reconfigure  the  Back  Ends?  (y/n) 


Enter  y  to  each  backend  that  you  want  in  the  new  configuration.  Once  you  confirm 
your  choice,  the  .config.db  file  will  be  updated  and  each  machine  in  the  new 
configuration  will  be  zeroed  for  loading  a  new  database. 

Indicate  which  Back  Ends  you  want  in  the  configuration 
by  responding  y/n  as  each  Back  End  is  listed: 

db3  ?  y 

db4  ?  n 

db6  ?  n 

db7  ?  y 

db9  ?  n 

The  Current  Configuration  is: 

Version  Name:  greg 

Controller:  dbS 

2  Back  Ends: 

db3 

db7 

Is  this  configuration  satisfactory?  (y/n) 

Once  you  confirm  your  choice  with  a  y,  the  .config.db  file  will  be  updated  and  each 
machine  in  the  new  configuration  will  be  zeroed  for  loading  a  new  database.  A  sample 
zero  for  one  back  end  and  for  the  controller  is  shown  in  the  previous  section.  To 
START  MDBS.  After  zeroing  is  complete,  you  will  be  given  the  option  to  continue  or 
exit  MDBS. 

Do  you  wish  to  run  the  Multi  Model,  Multi  Lingual, 

Multi  Backended  Database  System?  (y/n) 

Enter  y  to  start  up  the  system  with  the  new  configuration.  For  a  description  of  what 
happens  next,  refw  to  the  previous  section.  To  START  MDBS. 
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C.  TO  DISTRIBUTE  BACKEND  EXECUTABLE  CODE 


Initial  Conditions: 

-  The  same  as  described  at  the  beginning  of  this  appendix,  plus 

-  you  are  in  the  directory  for  the  current  version  since  the  code  will  be  copied  to  the 
directory  corresponding  to  the  current  version  on  each  back  end. 

To  tell  the  configure  program  that  you  want  to  distribute  executable  back  end  code  to  a 
new  or  existing  back  end.  At  the  UNIX  prompt  enter: 

start -d  or  begin  -d 

If  you  forget  to  use  the  -d  option,  you  can  answer  y  to  the  first  question  as  described 
above  in  START  MDBS. 


Welcome  to  MDBS,  today  is  Fri  Feb  19  14:48:48  PST  1993 


Check  the  time  each  file  was  last  compiled: 


-rwxrwxr-x 

1 

mdbs 

83649 

Nov 

5 

12:58 

. . /BE/bget .exe 

-rwxrwxr-x 

1 

mdbs 

83649 

Nov 

5 

12:58 

. . /BE/bput . exe 

-rwxrwxr-x 

1 

mdbs 

246357 

Nov 

5 

12:57 

. . /BE/cc.exe 

-rwxrwxr-x 

1 

mdbs 

89208 

Nov 

5 

12:57 

. . /BE/dio.exe 

-rwxrwxr-x 

1 

mdbs 

375581 

Nov 

5 

12:58 

. . /BE/dirman . exe 

-rwxrwxr-x 

1 

mdbs 

335559 

Nov 

5 

12:59 

. . /BE/ recp.exe 

-rwxrwxr-x 

1 

mdbs 

91811 

Nov 

5 

12:52 

. . /CNTRL/cget .exe 

-rwxrwxr-x 

1 

mdbs 

91811 

Nov 

5 

12:52 

. . /CNTRL/cput . exe 

-rwxrwxr-x 

1 

mdbs 

56680 

Nov 

5 

12:52 

. . /CNTRL/iig.exe 

-rwxrwxr-x 

1 

mdbs 

56971 

Nov 

5 

12:53 

. . /CNTRL/pp . exe 

-rwxrwxr-x 

1 

mdbs 

84382 

Nov 

5 

12:53 

. . /CNTRL/reqp . exe 

-rwxrwxr-x 

1 

mdbs 

488244 

Nov 

5 

12:56 

. ./CNTRL/ti.exe 

There  should  be  12  files  listed,  if  not  you  need  to  recompile. 
Do  you  need  to  recompile  before  copying  to  the  back  ends?  (y/n) 


If  the  code  has  not  been  modified  since  the  last  compilation,  then  enter  n.  Entering  y 
will  execute  a  ‘make  all’  on  the  system  which  may  take  several  minutes  to  complete.  It 
is  much  safer  to  execute  ‘make  aU’  from  outside  of  this  program  so  that  you  can  verify 
the  compilation  was  successful  before  copying  any  code  to  the  back  ends,  as  described 
in  Appendix  F. 


After  all  6  BE  processes  are  copied  to  every  back  end  defined  in  the  file,  configure.h., 
the  configuration  is  displayed  and  you  are  asked  if  you  want  to  reconfigure. 
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The  Current  Configuration  is: 


Version  Name;  greg 
Controller:  db8 

3  Back  Ends : 

db3 

db4 

db7 

WARNING:  All  data  will  be  lost  if  you  reconfigure 

Do  you  wish  to  reconfigure  the  Back  Ends?  (y/n) 

From  here  on  refer  to  the  previous  section,  TO  START  MDBS. 
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D.  TO  CHANGE  VERSIONS 


note:  disk  space  is  limited  on  almost  any  system.  Also  the  more  versions  in  existence, 
the  harder  it  is  to  integrate  changes  on  older  versions  into  the  newer  version(s). 
Minimize  the  number  of  copies  and  versions  of  code. 

Initial  Conditions: 

-  The  same  as  described  in  To  START  MDBS  plus 

-  R^licadon  of  Ae  entire^  controller  directory  tree  has  been  completed  from  an 
existing  version.^ 

After  changing  the  following  alias  so  that  it  changes  directory^  to  the  new  version,  at 
the  UNIX  prompt  on  the  controller,  type: 

start  -V  or  begin  -v 

This  will  display  similar  information  as  described  above  in  To  START  MDBS  except 
the  first  prompt  will  be  asking  you  to  confirm  that  you  want  to  change  the  version. 

The  Current  Configuration  is: 

Version  Name:  greg 

Controller:  dbS 

3  Back  Ends: 

db3 

clb4 

db7 

WARNING: (1)  changing  the  version  should  only  be  done  for  major  changes. 
-(2)  Please  consult  Dr.  Hsiao  or  all  project  members  first. 

(3)  This  program  MUST  be  run  from  the  new  version  directory 
e.g.  from  directory  /u/mdbs/VERSION/run . 

pwd  ==  /u/radbs/greg/run 


Do  you  still  want  to  change  the  current  version?  (y/n) 


Note  that  a  partial  copy  of  the  code  can  be  made  as  long  as  the  associated  makefile(s)  for  that 
copy  are  modified  to  find  the  files  fiom  another  version  that  were  not  copied  or  modified.  When 
devek^g  the  Object-Oriented  Data  Language  Interface,  only  the  CNTRlVLanglF  directory  was 
copied  on  db3  to  the  rich  directory.  The  controller  process  ti.out  was  the  only  process  of  twelve  to 
be  modified.  The  other  eleven  processes  from  cs4322A''erE.6  were  used  by  simply  modifying  the 
run  command  in  mdbs/rich/run  directory 

For  example,  if  you  wanted  to  copy  fte  entire  version  7  firom  db4,  you  could  execute  the  follow¬ 
ing  command  from  die  login  directory  of  the  (new)  controller 
cp  -r  /@db4/u/mdbs/7  NEW.VERSION 

The  configure  program  will  check  die  present  working  directory  for  the  new  version 
name,  if  it  is  not  present,  the  program  will  display  an  error  message  and  stq>. 
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If  you  enter  y,  you  will  be  asked  to  enter  and  verify  the  name  of  the  new  version 

Enter  the  new  version  name:  greg 

Please  verify  the  new  version  name  to  be:  greg 

Is  this  correct?  (y/n) (q/Q  to  Quit)  y 

Y ou  can  exit  the  version  change  process  by  entering  q  or  Q.  You  can  enter  n  to  re-enter 
the  new  version  name.  When  you  enter  y,  you  will  be  asked  if  the  Back  End  directory 
be.NEW_VERSION  has  been  set  up.^ 

pwd  includes  new  version,  updating  .config.db 
Have  Back  Ends  been  set  up  for  this  version?  (y/n) 

After  answering  y  or  n,  the  current  configuration  is  displayed  with  the  updated  Version 
name. 


The  Current 

Configuration  is; 

Version  Name 

:  thesis 

Controller: 

dbS 

3  Back  Ends: 

db3 

db4 

db7 

WARNING:  All  data  will  be  lost  if  you  reconfigure 
Do  you  wish  to  reconfigure  the  Back  Ends?  (y/n) 

From  this  point  on,  this  program  continues  as  previously  described  in  To  START 
MDBS. 

WARNING:  If  you  did  not  run  the  configure  program  from  the  directory  of  the  new 
version,  then  this  program  will  terminate  with  an  &ror  message  and  instructions  telling 
you  what  needs  to  be  done  to  continue. 


'*■  Set-up  includes  creating  the  directory  be.  VERSION,  creating  a  new  iun.be  file  for  the  VER¬ 
SION,  copying  the  new  run.be  to  be.  VERS  ION,  and  copying  the  executable  code  for  the  6  Back 
End  processes  to  be. VERSION. 
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E.  TO  LOAD  MDBS  ON  A  NEW  MACHINE 


Initial  Conditions: 

-  mdbs  account  exists  in  /usr/work/mdbs  directory. 

-  the  current  mdbs  machines  and  account  can  access  the  new  machine  via  “rep  dbn:/” 
where  n  is  the  number  of  the  new  machine. 

-  configure.h  has  been  updated  to  include  the  new  machine  as  either  a  back_end[]  or  as 
the  controller  and  main.c  which  includes  this  jWe  has  been  recompiled 

-  the  CPU  and  operating  system  are  identical^ 


As  a  CONTROLLER 

Copy®  the  entire  directory  tree '  which  is  described  in  Appendix  A 

Look  at  pcl.def  file  which  lists  two  socket  addresses  and  the  controller  name  used  by 
mdbs.. 

As  a  BACK  END 

mkdir  /u/mdbs/bin  and  copy  from  another  back  end: 

stop.cmd,  awk.exe,  and  zero.  If  the  new  machine  is  not  identical  to  the  one  you 
copied  from,  then  you  must  reompile  the  zero  file  on  the  new  machine  using  the  code 
from  the  BE/DIO  subdirectory 
mkdir  /u/mdbs/be.<CURRENT_VERSION> 

copy  executables  and  run.be  from  the  same  directory  on  another  back  end. 

mkdir  /u/mdbs/Sockets 

mkdir  /u/mdbs/UserFiles 


If  they  are  not  then  a  detailed  evaluation  of  the  source  code  for  system  calls  must  be  done.  Refer 
to  the  masters  thesis  of  MAJ  Stan  Waddns  to  be  completed  by  September  1993. 

The  entire  code  can  be  copied  from  another  machine,  say  dbS,  by  starting  at  the  login  directory 
of  the  new  machine  and  typing:  rep  -r  db8:/u/mdbs/greg  NEW_VERSION 
It  is  possible  to  only  cc^y  the  6  controllCT  processes,  atal  the  NEW_VER.SION/nin  files;  how¬ 
ever,  you  will  need  more  if  you  plan  on  doing  any  revisions  to  the  code  on  the  new  machine. 
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F.  SAMPLE  ERROR  MESSAGES 


1.  Invalid  Option 

If  you  forget  what  the  options  are  or  what  they  do,  enter  an  invalid  one  such 
as  start  -h  or  begin  -q.  You  will  get  the  following  response: 

Welcome  to  MDBS,  today  is  Tue  Mar  2  12:56:47  PST  1993 

unrecognized  option  h,  valid  options:  [-c  -v  -d] 

-c  change  Back  End  CONFIGURATION 

-d  copy  (DISTRIBUTE)  executable  code  to  Back  Ends 
-V  change  VERSION  name 

2.  Incorrect  new  version  name 

If  you  change  the  version  name  (e.g.  to  ‘7’)  in  a  directory  that  does  not 
contain  the  version  name  you  will  see  (after  you  confirm  the  new  version  name): 

ERROR:  7  not  found  in  pwd. 

.config.db  should  only  be  updated  in  /mdbs/7/run  directory, 
.config.db  not  updated. 


3.  Incorrect  version  for  pwd 

If  you  try  to  run  mdbs  with  the  version  in  .config.db  not  in  pwd  (present 
working  directory),  you  will  get  the  error  message  displayed  below. 

Do  you  wish  to  run  the  Multi  Model,  Multi  Lingual, 

Multi  Backended  Database  System?  (y/n)  y 


ERROR:  version  name  '7*  from  .config.db 

is  not  in  current  path:  /u/mdbs /greg/run 

If  version  name  is  not  correct,  restart  with  -v  option 
to  correct  version  naune. 

If  version  name  is  correct,  then  you  are  running  main 
from  the  wrong  directory.  You  must  either 
run  main  from  /mdbs/7/run  or 

change  begin  /  mdbs  /  start  alias  to  cd  to  /mdbs/7/run 

When  changing  versions,  remember  to  verify  default_version  is 
correct  in  configure. h.  If  a  change  is  made, 
recompile  main.c  then  copy  main  to  /mdbs/7/run/main 

Exiting  configure  program. 
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4.  Include  file  out  of  date 

If  you  try  to  run  a  new  version  from  the  correct  directory,  but  forget  to 
change  the  initialization  of  the  global  variable  ‘default_version’  in  configure.h  to  equal  the 
new  version  name,  you  will  get  the  following  error  message: 


ERROR:  default_version  "greg*  does  not  match 

current_version  "andy" 

If  current_version  is  correct, 

Go  to  /mdbs /andy /bin  directory 

In  configure.h  correct  default_version 

compile  main.c  by  typing  'make  auto' 

makefile  will  automatically  copy  main  to  run/main 

If  current_version  is  not  correct,  restart  with  -v  option 
to  correct  version  name. 

Exiting  configure  program. 


5.  Missing  file  configuration  file 

If  the  .config.db  file  is  missing  from  the  current  directory,  you  will  see  the 
following  WARNING: 


WARNING;  unable  to  read  .config.db 

Configuring  for  1  Back  End:  db4 
Controller:  dbS 

Version:  mdbs 


6.  Missing  or  Incorrect  process  names 

If  a  file  is  missing  or  you  did  not  change  the  make  file  in  /u/mdbs/ 
VERSION/  (CNTRL  &  BE)  to  name  the  12  executable  files  to  end  in  “.exe:  then  you 
will  see: 

starLcntrl:  ../CNTRL/<process_name>.exe;  not  found 


for  each  back  end 
run.be:  bgetexe;  not  found 
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APPENDIX  C.  ON-LINE  DOCUMENTATION 


db8  /usr/work/mdbs/Docs 


cc/ 

com/ 

dio/ 

dm/ 

iig/ 

pp/ 

recp/ 

reqp/ 

sndrcv/ 

tmp1/ 

Figure  9:  Documentation  of  MDBS  design  and  structures 

The  documentation  files  are  mostly  text  files  with  occasional  troff  formatted  files  which 
contain  formatting  commands  and  can  be  read  as  text.  There  are  some  diagrams  in  the 
sub-directories  of  Struct  which  are  text  files. 

Map/ 

A  file  for  some  of  the  12  processes  listing  the  function  names,  source  file 
where  function  can  be  found,  and  the  purpose  of  the  function.  The  latest  revision  was 
done  in  May  of  1988. 

cc.map  -  concurrency  control 

common.map  - 

com_msg.map  - 

dio.map  -  disk  i/o  process 

dm.map  -  directory  management  process 

lig.map  -  process  iig; 

pp.map  -  post  processing 

recp.map  -  record  processing 

reqp.map  -  request  preparation 

ti.map  -  test  interface 
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Msg/ 

A  file  for  each  type  of  message  used  for  inter  process  communications.  Each 
file  contains  type,  sender  process,  receiver  process,  source  file,  and  purpose. 

guide  -  contains  information  on  the  format  of  the  files  in  this  directory, 
toe  -  contains  the  table  of  contents  for  this  directory. 


Pcode/ 

Pseudo  code  explaining  the  general  sequence  of  events  performed  by  each 
process.  The  code  is  not  complete,  and  was  probably  written  in  1988. 

Communication.pc 

cc.pc 

common.pc 

dio.pc 

dm.pc 

grecp.pc 

iig.pc 

recp.pc 

reqp.pc 

ti,pc 


Struct/ 

cc/ 

com/ 

dio/ 

dm/ 

iig/ 

pic 

picl 

pic2 

PP/ 

print_them*  -  a  useful  shell  senpt  to  print  all  the  *.s  files 

reep/ 

reqp/ 

sndrev/ 

tmpl/ 
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APPENDIX  D.  FLOW  CHARTS  FOR  SYSTEM  GENERATION 

SOFTWARE 


The  following  four  pages  describe  the  control  flow  of  the  software  I  developed  to 
make  MDBS  a  more  manageable  system.  The  first  page  is  an  overview  of  the  whole 
program.  It  describes  the  main()  procedure.  The  following  three  pages  show  more  detail  of 
three  of  the  functions  called  by  the  main  procedure.  The  three  functions  are 
get_configuration(),  reconfigure(),  and  change_version(). 
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geucoafiforetioo 


Mi_pichs 
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APPENDIX  E.  SOURCE  CODE  FOR  SYSTEM  GENERATION 

SOFTWARE 


The  following  code  is  found  in  Version_Nanie/bin  and  is  compiled  by  being  in  that 
directory  on  db8  or  dbl  1  and  typing  ‘make  auto’.  Auto  is  short  for  automatic  reconfiguration 
and  stiut-up  of  the  Multibackend  Database  Supercomputer  (MDBS). 

The  first  file,  configure.h  is  the  include  file  for  main.c  on  db8.  The  second  file, 
configure.h  is  the  include  file  for  main.c  on  dbll.  The  third  file  is  main.c,  the  same  on  all 
machines. 
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/* - 

t  UNIT  NAME 


PRCXStAM 

DESCRIPTION 


CONFIGURATION  configure. h 

include  file  for  main.c 

sfored  in  directory;  current_version/bin 

Start  the  Multi-Lingual  Parallel  Distributed 
Database  program  with  varying  number  of  Back  Ends. 


files  needed  to  run  main.c: 

““>  on  Controller 
in  current  version/run/: 


zero.dbn 


stop.dbn 


start. cntrl 

master.run.be 

.config.db 

main 


shell  script  which  uses  rsh  to  call  the 
zero  command  on  the  named  back  end 
number  *n'  to  zero  meta  6  data  disks 
shell  script  which  uses  rsh  to  call  the 
stop.cmd  command  on  the  named  back  end 
nvunber  'n'  or  execute  on  controller 
to  stop  6  processes 
shell  script  which  starts  5  of  6 
controller  processes  in  background 
template  used  to  create  back  end  run.be 
current  version  name  followed  by  a 

list  of  back  ends  in  current  configuration 
this  code  compiled 


in  current_version/CNTRL/: 
all  6  controller  processes 


INPUTS 

OUTPUTS 

CREATED 

MODIFIED 

AUTHOR 

PROJECT 

PURPOSE 


System 

Compiler 


”">  on  each  Back  End  <“- 
in  /u/mdbs/be.current_version/: 

*.exe  :  all  6  back  end  processes 

run.be  :  shell  script  to  start  6  Back  End  processes 
trace  files  for  all  6  processes  (will  be  created) 

in  /u/mdbs/bin/: 

zero  :  C  program  to  erase  memory 

(source  in  current_version/BE/DIO) 
stop.cmd  :  shell  script  to  kill  last  6  pid's  run 
exe.awk  :  called  by  stop.cmd,  pattern  match  *.exe  in  ps  x 

The  code  in  this  program  is  represented  pictorially,  see  thesis 

stdin,  .config.db  file 
stdout 

19  October  1992 

many  times  through  May  1993 

Andrew  P.  Meeks 

AUTOMATE  the  START-UP  process  of  the  Multi-Modal, 

Multi-Back  ended,  Multi-Lingual  Database  System 

UNIX 

cc 

- */ 


/*  IF  HARDWARE  CONFIGURATION  CHANGES  (equipment  added,  fixed,  removed,  breaks, 
etc)  you  must  review  the  following  7  lines  of  code  for  change  */ 


/*  enum  all_isivs  {  dbl,  db2,  db3,  db4,  dbS,  db6,  db7,  dbS,  db9  }  ; 

char  *isiv  namesO  -  ("dbl",  "db2",  ”db3'',  ''db4",  "dbS",  "db6",  "db7",  "dbS",  "db9''} 

*/ 


/*  must  have  at  least  MAX_BACK_ENDS  ”  ”  defined  */ 

/*  the  length  of  the  blank  strTngs  must  be  at  least  as  long  as  the 

longest  back  end  name  */ 

/*  the  following  should  be  declared  inside  mainO,  but  the  conpiler 
will  not  allow  "automatic  aggregate  initialization" 
unless  declared  was  in  the  header  file  */ 

char  *current_BackEnds [ )  -  ("  "  ",  "  ",  "  "); 
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/*  uncoment  next  3  lines  for  use  with  dbl  at  Pt  Mugu: 
char  *be_naines[J  -  {"dbl"}; 

enum  Back_Ends  {  dbl  }; 

enum  Back_Ends  default_be  -  dbl; 

be  sure  to  comment  out  or  delete  subsequent  duplicate  declarations  */ 

/*  uncoment  next  3  lines  for  use  with  db2  at  SUPSHIP  San  Francisco: 
char  *be_naines  [  ]  ■■  { "db2" } ; 

enum  Back_Ends  {  db2  }; 

enum  Back_Ends  default_be  ••  db2; 

be  sure  to  comment  out  or  delete  subsequent  duplicate  declarations  */ 

char  *be_names[]  -  {"db3",  "db4",  "db6",  "db7",  "db9"}; 

enum  Back_Ends  {  db3,  db4,  db6,  db7,  db9  }; 

enum  Back_Ends  default_be  -  db4; 

char  ‘controller  -  "dbS";  /*  to  change,  need  to 

(1)  update  pcl.def  include  file 

(2)  reconcile  executable  code  */ 

/*  enum  Back_Ends  and  all^isivs  can  be  used 

to  index  be_names[]  and  isiv_names[]  */ 

/*  change  5  to  1  for  use  with  single  back  end  systems  dbl  and  db2  */ 

tdefine  MAX_BACK_ENDS  5 

tdefine  MAX_STRING  64  /*  used  for  ALL  non  constant  string  declarations  */ 

/*  IF  VERSION  IS  CHANGED  they  you  must  update  the  following  line  accordingly  */ 
char  default_version [MAX_STRING]  -  "mdbs";  /*  constant  */ 

/*  The  following  two  lines  must  correspond  to  output  of  "make  all"  in 
/u/mdbs/VERTION/bin  directory.  Actual  files  created  in  the  makefiles 
of  each  process  */ 

char  *BE_process[]  -  {"bget.exe",  "bput.exe",  "cc.exe", 

"dio.exe",  "dirroan.exe",  "recp.exe"}; 

char  *CNTRL_process []  -  ("ti.exe  ",  "cget.exe",  "cput.exe", 

"iig.exe",  "pp.exe",  "reqp.exe"  }; 

/*  "ti.exe  ”  includes  a  space  since  it  is  called  with  a  parameter  */ 
int  N_PROCESSES  -  6; 

tdefine  FALSE  0 
tdefine  TRUE  1 
tdefine  ERROR  1 
tdefine  NO_ERROR  0 

tdefine  NO  0 
tdefine  YES  1 

/*  STATUS  of  equipment  11/6/92: 

dbl  Pt .  Mugu 

db2  SUPSHIP  San  Francisco 

db3  back  end,  working 

db4  back  end,  working 

dbS  no  meta  or  data  disk 

db6  no  data  disk 

db7  back  end,  working 

dbS  controller,  working 

db9  no  data  disk 

*/ 

/*  eof;  configure. h  */ 
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/*- 


UNIT  NAME 


PROGRAM 

DESCRIPTION 


INPUTS 

OUTPUTS 

CREATED 

MODIFIED 

AUTHOR 

PROJECT 

PURPOSE 

System 

Compiler 


CONFIGURATION 
include  file  for  main.c 

stored  in  directory: 


configure. h 
current  version /bin 


Start  the  Multi-Lingual  Parallel  Distributed 
Database  program  with  varying  number  of  Back  Ends. 

files  needed  to  run  main.c: 

-“>  on  Controller  <— 
in  current  version/run/: 


zero . dbn 


stop.dbn 


start .cntrl 

master.run.be 
. conf ig.db 


shell  script  which  uses  rsh  to  call  the 
zero  connand  on  the  named  back  end 
number  'n'  to  zero  meta  &  data  disks 
shell  script  which  uses  rsh  to  call  the 
stop.cmd  command  on  the  nemied  back  end 
number  'n'  or  execute  on  controller 
to  stop  6  processes 
shell  script  which  starts  5  of  6 
controller  processes  in  background 
template  used  to  create  back  end  run.be 
current  version  name  followed  by  a 

list  of  back  ends  in  current  configuration 
this  code  compiled 


in  current_version/CNTRL/: 
all  6  controller  processes 


”“>  on  each  Back  End 
in  /u/mdbs/be.current_version/: 

*.exe  :  all  6  back  end  processes 

run.be  :  shell  script  to  start  6  Back  End  processes 
trace  files  for  all  6  processes  (will  be  created) 

in  /u/mdbs/bin/: 

zero  :  C  program  to  erase  memory 

(source  in  current_version/BE/DIO) 
stop.cmd  :  shell  script  to  kill  last  6  pid's  run 
exe.awk  :  called  by  stop.cmd,  pattern  match  *.exe  in  ps  x 

The  code  in  this  program  is  represented  pictorially,  see  thesis 

stdin,  .conf ig.db  file 
stdout 

19  October  1992 

27  April  1993  (by  S. Watkins  for  new  Sun  hardware) 

Andrew  P .  Meeks 

AUTOMATE  the  START-UP  process  of  the  Multi-Modal, 

Multi-Back  ended,  Multi-Lingual  Database  System 

UNIX 


/*  IF  HARDWARE  CONFIGURATION  CHANGES  (equipment  added,  fixed,  removed,  breaks, 
etc)  you  must  review  the  following  7  lines  of  code  for  change  */ 

/*  enum  all_isivs  (  dbll,  dbl2,  dbl3  ) ; 

char  *i8iv_names[l  -  ("dbll",  "dbl2",  "dbl3");  */ 

/*  must  have  at  least  MAX_BACK_ENDS  "  "  defined  */ 

/*  the  length  of  the  blank  strings  must  be  at  least  as  long  as  the 

longest  back  end  name  */ 

/*  the  following  should  be  declared  inside  main(),  but  the  compiler 
will  not  allow  "automatic  aggregate  initialization" 
unless  declared  was  in  the  header  file  */ 


char  *current_BackEnds [ ]  -  ("  ", 

char  *be_name8[]  -  ("dbl2". 


''dbl3"); 
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dbl3  }; 


enum  Back_Ends  (  dbl2, 

enum  Back_Ends  de£ault_be  >  dbl3; 

char  *controller  -  "dbll";  /*  to  change,  need  to 

(1)  update  pcl.def  include  file 

(2)  recompile  executable  code  */ 

/*  enum  Back_Ends  and  all_isivs  can  be  used 

to  index  be_names[]  and  isiv_names[]  */ 

♦define  MAX_BACK_ENDS  2 

♦define  MAX_STRING  64  /*  used  for  ALL  non  constant  string  declarations  */ 

/*  IF  VERSION  IS  CHANGED  they  you  must  update  the  following  line  accordingly  */ 
char  default_version [MAX_STRING]  -  "greg"; 


char  current_version (MAX_STRING] ;  /*  read  in  from  .config.db  file  */ 

char  BEjpath[MAX_STRINGl;  /*  see  set_path  ();  '"/u/mdbs/be. VERSION/";  */ 

char  CNTRLjpath  [MAX_STRING] ;  /*  see  setjpath  ()  ;  "/u/mdbs/VERSION/CNTRL/" ;  */ 

/*  The  following  two  lines  must  correspond  to  output  of  "make  all"  in 
/u/mdbs/VERTION/bin  directory.  Actual  files  created  in  the  makefiles 
of  each  process  */ 

char  *BE_process [ )  •  {"bget.exe",  "bput.exe",  "cc.exe", 

"dio.exe",  "dirman.exe",  "recp.exe"); 


char  *CNTRL_process [1  =  {"ti.exe  ",  "cget.exe",  "cput.exe", 

"iig.exe",  "pp.exe",  "reqp.exe"  ); 

/*  "ti.exe  "  includes  a  space  since  it  is  called  with  a  parameter  */ 
int  N_PROCESSES  -  6; 

♦define  FALSE  0 
♦define  TRUE  1 
♦define  ERROR  1 
♦define  NO_ERROR  0 

♦define  NO  0 
♦define  YES  1 

/*  STATUS  of  old  ISIV  equipment  930427: 


dbl 

Pt .  Mugu 

db2 

SUPSHIP  San 

Francisco 

db3 

back  end. 

working 

db4 

back  end. 

working 

dbS 

no  meta  disk 

db6 

no  data  disk 

db7 

back  end. 

working 

dbS 

controller. 

working 

db9 

no  data  disk 

*/ 

/*  eof:  configure. h  */ 
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/* - 

I  ONIT_NAME 
I 

I  PROGRAM 
I  DESCRIPTION 


CONFIGURE  main.c 

file  stored  in  directory:  VERSION/bin 

start  the  Multi-Lingual,  Parallel,  Distributed 
Database  program  with  varying  number  of  Back  Ends. 

GENERAL  OUTLINE: 

check  if  called  with  -c,  -v,  -d  options  ';heck_f or_flags  () 
read  previous  configuration  get_configuration  () 

set  controller  and  back  end  paths  set_paths  () 

display  configuration  show_conf iguration  () 

if  -V  option  used 

get  new  version  name  change_version  () 

if  -c  option  not  used,  ask  if  user  wants  to  reconfigure 
WARNING:  will  erase  database 
if  YES  or  -c  option  used 

erase  previous  database  data 
get  new  configuration 
save  new  configuration 
if  NO  and  -c  option  NOT  used 

ask  if  user  wants  to  use  present 
if  NO 

erase  previous  database  data 
kill  all  lingering  processes 
start  5  Controller  processes 
start  all  Back  End  processes 
start  ti  on  coiitroller 


zero  0 

get_conf iguration  () 


database 
ze:  >  0 

system  (stop. machine) 
system  (start .cntrl) 
system  (rsh  run.be) 
system  (ti.exe  #) 


INPUTS 

OUTPUTS 

CREATED 

MODIFIED 

AUTHOR 

PROJECT 

PURPOSE 

System 

Compiler 


NOTES: 

Version  can  ONLY  be  changed  f  main  called  with  -v  option 
Controller  is  defined  in  "configure. h"  to  be  dbS 
Available  Back  Ends  are  defined  in  "configure. h" 

The  code  in  this  program  is  represented  pictorially,  see  thesis 

WARNING:  revise  configure. h  with  each  version  change 

stdin,  .config.db  file 
stdout 

19  October  1992 

many  times  through  March  1993 

Andrew  P .  Meeks 

MDBS 

get  multi-model,  multi -oackended,  multi-lingual  database 
system  to  work 

UNIX 

cc 

- */ 


tinclude  <strings.h> 
#include  <stdio.h> 
tinclude  "configure. h" 
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*  CHECK_FOR_FLAGS  evaluate  conmand  line  parameters 

* 


*  parameters:  argc,  argv  (in)  same  as  in  mainO 

*  pNew_ver  ^n  out)  whether  -v  option  used 

*  used  for  version  change 

*  pNew_config  (in  out)  whether  -c  option  used 

*  used  for  configuration  change 

*  pDist_code  (in  out)  whether  -d  option  used 

*  used  to  copy  code  to  BE's 

*  .  you  can  only  used  one  option  per  sign 

* 

*  return  FALSE  if  no  invalid  options  used  to  call  this  program 

*  ERROR  if  an  .invalid  option  was  used 
*/ 

int  check_for_flags  (argc,  argv,  pNew_ver,  pNew_config,  pDist_code  ) 
int  argc; 
char  *argv [ 1 ; 

int  *pNew_ver,  *pNew_config,  *pDist_code  ; 


char  *pCH; 
int  teirp  -  1; 

if  (temp  <  argc) 

{ 

while  (tenp  <  argc) 

( 

pCH  “  argv  [temp] 
if  (*pCH  ==■  '-') 


/*  pointer  to  parameter  string 


/*  at  least  one  parameter  used  */ 
/*  another  parameter  to  process  */ 


/*  first  character  must  be  a 


pCH  ++; 
switch  (*pCH) 


/*  remember  which  option (s)  used  */ 


case  'c'  : 

*pNew_config 

break; 

-  TRUE; 

case  'V'  : 

*pNew_ver 

break; 

-  TRUE; 

case  'd'  : 

*pDist_code 

break; 

=  TRUE; 

default  : 

printf  ("unrecognized  option  %c,  ",  *90!) ; 

printf  ('valid  options:  [-c  -v  -d)\n\n"); 
printf  C'-c  change  Back  End  CONFIGORATIONXn") ; 
printf  ("-d  copy  (DISTRIBUTE)  executable  ") ; 

printf  ("code  to  Back  Ends\n"); 
printf  ("-V  change  VERSION  name\n\n") ; 
return  (ERROR) ; 
break; 


temp  ++; 


/*  process  next  option 


return  (FALSE); 

/*  end  check_for_flags  ()  */ 
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/*  - 

*  NOT_IN_PWD  return  TRUE  if  string  (Version  name)  in  PWD  string 

* 

*  parameters:  pwd  (in)  current  wor](ing  directory 

*  current_version  (in)  version  name 

* 

*  return:  TRUE  if  cuirent_version  found  in  pwd  string 

*  FALSE  if  current_version  not  found  in  pwd  string 

*/ 

int  not_in_pwd  (pwd,  current_version) 
char  *pwd; 

char  ‘current  version; 

char  *pch  -  pwd;  /*  pointer  to  current  portion  of  pwd  */ 

int  length  -  strlen  (current_version) ; 

while  (*pch) 

{ 

pch++;  /*  increment  pointer  to  character  after  '/'  */ 

if  (strncnp  (pch,  current  version,  length) ) 

{ 

/*  scan  path  for  next  '/'  but  not  past  end  of  string  */ 
while  (*pch  !=  '/'  &&  *pch) 

++pch; 

} 

else 

{  /*  found  current_version  in  pwd  */ 

return  (FALSE) ; 

} 

) 

return  (TRUE) ; 

)  /*  end  not_injpwd  ()  */ 


/*  - 

*  SHOW_COKFIGURATIONdisplay  version,  controller,  and  bac)c  ends  to  be  used 

* 

*  parameters:  n_Bac)cEnds  (in)  the  number  in  current  Configuration 

*  current_Bac)cEnds  (]  (in)  strings  of  each  bac)(  end  name 

*  current_version  ^n)  name  of  version  in  .config.db 

* 

*  return:  no  information 

*/ 

int  show_conf iguration  (n_Bac)cEnds,  current_BaclcEnds,  current_version) 

int  n_Bac)tEnds; 

char  *current_Bac)cEnds  [  ]  ; 

char  *current_version; 

{ 

int  i;  /*  array  index  */ 

printf  ("\nThe  Current  Configuration  is:  \n") ; 
printf  ("\nVersion  Name:\t%s  \n",  current_version) ; 
printf  ("\nController:\t%s  \n",  controller); 
if  (n_Bac]cEnds  «-  1) 

printf  ("\nl  Bac}c  End:  \n")  ; 

else 

printf  ("\n%d  Bac)?  End;:  \n",  n_Bac)cEnds) ; 

for  (i-0;  i<  n_BackEnds;  i++) 

printf  ("\t\t%s  \n",  current_Bac)cEnds  [i] )  ; 

) 
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/*  - 

*  DEFAOLT_CONFIG  configures  database  to  have  one  back  end 

* 

*  parameters:  pN_BackEnds  (out)  number  in  default  Configuration 

*  current_BackEnds [ ]  (in)  strings  of  each  back  end  name 

*  current_version  (in)  name  of  version  in  .config.db 

* 

*  return:  NO_ERROR  if  default  configuration  saved  in  .config.db 

*  ERROR  if  unable  to  open  .config.db 

*/ 

int  default_config  (pN_BackEnds,  current_BackEhids ,  current_version) 

int  *pN_BackEnds; 

char  *current_BackEnds [ ] ; 

char  *current_version; 

{ 

FILE  *pFILE;  /*  pointer  to  .config.db  file  */ 

/*  .config.db  does  not  exist  or  cannot  be  written  to  */ . 

printf  ("WARNING:  unable  to  read  .conf ig.db\n\n") ; 
printf  ("Configuring  for  1  Back  End:  %s\n",  be_names (default_be) ) ; 
printf  r'\t\tController :  %s\n",  controller)  ; 

printf  ("\t\tVersion:  %s\n",  default_version)  ; 

♦pNJBackEnds  =  1; 

/*  initialize  current_version  */ 

strcpy  (current_version,  default_version) ; 

strcpy  (current_BackEnds [0]  ,  be_names [default_be) ) ; 

if  (pFILE  «  fopen  (".config.db",  "w") ) 

( 

fprintf  (pFILE,  "%s\n",  default_version) ; 
fprintf  (pFILE,  "%s\n",  be_names  (default  be] ) ; 
fclose  (pFILE) ; 

} 

else 

{  . 

printf  ("unable  to  write  .config. db\n") ; 
return  (ERROR) ; 

) 

return  (NO_ERROR) ; 

)  /*  end  default_conf ig  ()  */ 


/* - 

*  GET_CONFIGDRATIONread  .config.db  file  from  pwd 

*  initialize  the  two  parameters: 


*  parameters : 

* 

* 

* 


pN_BackEnds 
current_BackEnds [ ] 
current  version 


(out)  number  of  Back  Ends  in  .config.db 

(in)  list  of  back  end  names 

(out)  name  of  version  in  .config.db 


*  return:  N0_EBR0R  if  valid  .config.db  exists  upon  exiting 

*  ERROR  if  unrecoverable  error  in  .config.db 
*/ 

int  get_conf iguration  (pN_BackEnds,  current_BackEnds,  current_version) 

int  *pN_BackEnds; 

char  •current_BackEnd8 [ ] ; 

cliar  ^current  version; 


( 


char  pch[MAX  STRING]; 

/* 

temporary  pointer 

to  string 

*/ 

FILE  *pFILE; 

/* 

pointer  to  .config 

.db  file 

*/ 

int  found; 

/* 

boolean,  if  BE  is 

a  valid  one 

*/ 

int  i ; 

/» 

array  index 

•/ 
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if  (pFILE  -  fopen  (".config.db",  “r")) 

/*  .config.db  file  is  oj>en  */ 

fscanf  (pFILE,  "%s",  pch) ; 

/*  initialize  current_version 

should  be  strncpy  (, ,MAX_STRING) ;  */ 
strcpy  (current_version,  pch)  ; 

if  (strcmp  (default_version,  current_version) ) 

{ 

printf  ("VnError:  configure. h  file  has  a  different  version  name  ") ; 
printf  ("than  the  default Xn") ; 

printf  ("\t  char  *default_version  -\"%s\";  ",  default_version) ; 

printf  ("vs.  .config.db:  \"%s\"  \n\n  ",  current_version) ; 
printf  ("XtAny  time  the  version  is  changed,  default_version  in  ")  ; 
printf  ("configure. h  \n\t  must  be  updated. \n") ; 
printf  ("XtOpdate  configure. h  as  needed  and  recompile  main.cXn"); 
printf  ("XtThe  files  are  found  in  %s/bin\n'',  current_version) ; 

/*  return  (ERROR);  not  used  since  exits  later  */ 

} 


while  (fscanf  (pFILE,  "%s",  pch)  !-  EOF) 

( 

/*  I  feof  (pFlLE)  */ 

strcpy  (current_Bac)cEnds[*pN_Bac)cEnds] ,  pch)  ; 
found  -  FALSE; 

for  (i  -  0;  i  <  MAX_BACK_ENDS;  i++) 

if  (strcmp  (pchT  be_names[i])  “  0) 

{ 

found  -  TRUE; 
brea)c; 

) 

if  (found) 

( *pN_Bac)cEnds )  ++ ; 
else  ~ 

printf  ("ERROR,  back  end  name  %s  in  .config.db  invalidXn",  pch); 

)  /*  end  while  »/ 

fclose  (pFILE); 

if  (*pN_Bac)tEnds  “  0) 

return  (default_config  (pN_BackEnds ,  current_BackEnds, current_version) ) 

}  /*  if  fopen  .config.db  */ 

else 

( 

return  (default_config (pN_BackEnds,  current_BackEnds ,  current_version) ) ; 

) 

return  (NO_ERROR) ; 

}  /*  end  get_conf iguration  ()  */ 
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/*  - 

*  ZERO_BACKENDS  calls  a  shell  coimnand  in  pwd  which  calls  zero  on 

*  each  back  end  in  current  configuration 

* 

*  parameters:  n_BackEnds  (in)  number  of  BackEnds  in  current  configuration 

*  current_BackEnds [ ]  list  of  back  end  names  in  current  conf ig 

* 

*  return:  no  information 

*/ 

int  zero_backends  (n_BackEnds,  current_BackEnds) 

int,  n_BackEnds; 

char  *current_BackEnds [ ] ; 

{ 


int  i  -  0;  /*  array  index  */ 

char  pch[MAX_STRING] ;  /*  ten^orary  pointer  to  string  */ 

for  (;  i  <  n_BackEnds;  i++) 

{ 

/*  for  each  back  end  being  used,  call  shell  scripts  to  */ 

/*  zero  meta  and  data  disks  */ 


sprintf  (pch,  "zero.%s",  current_BackEnds [i] )  ; 
system  (pch) ; 

) 

/*  on  controller,  call  shell  script  to  remove  files  not  needed  */ 

sprintf  (pch,  "zero.%s",  controller); 
system  (pch) ; 


/*  - 

*  SET_PATHS  sets  the  path  of  BE_path,  CNTRL_path 

*  parameters:  current_version  (in)  name  of  current  version 

*  BE_path  (out)  complete  path  name  for  directory 

*  containing  executable  BackEnd  code 

*  CNTRL_path  (out)  complete  path  name  for  directory 

*  containing  executable  Control  code 

*  return:  no  information 

*/ 

int  set_paths  (current_version,  BE_path,  CNTRL_path) 
char  *current_version,  *BE_path,  *CNTRI>_path; 

/*  initialize  BE_path  to  /u/mdbs/be. VERSION/  */ 
strcpy  (BE_path,  "/u/mdbs/be."); 
strcat  (BE_path,  current_version) ; 
street  (BE_path,  "/")  ; 

/*  initialize  CNTRL_path  to  /u/mdbs/VERSION/CNTRL/  */ 
strcpy  (CNTRL_path,  ”/u/mdbs/"); 
strcat  (CNTRL__path,  current_version) ; 
street  (CNTRI,_j>ath,  "/CNTRL/"); 
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/*  - 

*  CHANGE_VERSION  change  version  name  and  /  or  copy  executable  code  to 

*  each  working  back  end 


* 

* 

parameters : 

pN_BackEnds 

(in) 

* 

current_BackEnds [ ] 

* 

current  version  (in) 

* 

* 

BE_path 

(in  out) 

* 

* 

CNTRL_path 

(in  out) 

* 

returns : 

YES  if  new 

version 

* 

NO  if  no 

change  to 

*/ 

number  of  BackEnds  in  current 
configuration 

list  of  BackEnd  names  in  current  config 
name  of  current  version 
conplete  path  name  for  directory 
containing  executable  BackEnd  code 
complete  path  name  for  directory 
containing  executable  Control  code 


int  change_version  (pN_BackEnds ,  current_BackEnds ,  current_version, 

BE_path,  CNTRL_path,  pwd) 

int  *pN_BackEnds; 

char  *current_BackEnds [ ] ; 

char  *current_version; 

char  *BE_path; 

char  *CNTRL_path; 

char  *pwd; 

{ 


char  pch[MAX_STRING] ; 

/* 

temporary  pointer  to  string 

*/ 

char  new_version (MAX_STRING] ; 

/* 

new  version-name 

*/ 

int  i; 

/* 

array  index 

*/ 

int  found  •  TRUE; 

/* 

boolean  if  pwd  includes  new  version 

*/ 

FILE  *pFILE; 

/* 

file  pointer  to  .config.db  file 

*/ 

/*  ask  if  you  want  to  change  version  */ 

printf  ("\nWARNING: \t (1)  changing  the  version  should  only  be  done  for  "); 
printf  ("major  changes. \n  "); 

printf  ("\t\t(2)  Please  consult  Dr.  Hsiao  or  all  project  members  first\n"); 
printf  ("\t\t(3)  This  MOST  be  run  from  the  new  version  directoryXn") ; 
printf  ("\t\t  e.g.  from  directory  /u/mdbs/VERSION/run. \n\n") ; 


printf  ("\tpwd  --  %s\n",  pwd); 


printf  ("\nDo  you  still  want  to  change  the  current  version?  (y/n)  "); 

scanf  ("%s",  pch) ; 
printf  ("\n") ; 

if  {*pch  —  'y'  I  1  *pch  —  'Y' ) 

{ 

*pch  -  'n' ; 


while  (  (*pch  --  'n'  II  *pch  ~  'N' )  6«  (*pch  !-  'q'  II  *pch  !-  'Q' )  ) 

{ 

printf  ("\nEnter  the  new  version  name:  ") ; 
scanf  ("%s",  new_version) ; 
printf  ("\n") ; 

printf  ("\nPlease  verify  the  new  version  name  to  be:  %s\n", 

new_version)  ; 

printf  ("\nls  this  correct?  (y/n) (q/Q  to  Quit)  ") ; 
scanf  '("%s",  pch)  ; 
printf  ("\n"); 

} 

if  (*pch  --  'q'  II  *pch  —  'Q' ) 
return  (NO) ; 


strcpy  (current_version,  new_version) ; 

/*  -  should  verify  version  name  valid.  -  */ 

/* - no  characters - */ 

/*  NOT  IMPLEMENTED  */ 
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/*  verify  that  new  version  is  in  the  current  path  */ 

/*  - */ 

if  (not_in_pwd  (pwd,  current_version) ) 

{ 

printf  ("ERROR:  %s  not  found  in  pwd.  \n",  current_version) ; 
printf  ("\t.config.db  should  only  be  updated  in  /mdbs/"); 
printf  ("%s/run  directory. \n\n",  current_version) ; 
printf  O'Xt.config.db  not  updated. \n") ; 
return  (NO) ; 

) 

/* - rewrite  .config.db -  */ 

/* -  */ 

if  (pFILE  “  fopen  (".config.db",  "w")) 

{ 

printf  ("pwd  includes  new  version,  updating  .config.db  \n") ; 
fprintf  (pFILE,  "%s\n",  current_version) ; 

for  (i”0;  i  <  *pN_BaokElnds;  i++) 

( 

fprintf  (pFILE,  "%s\n",  current_BackEnds  [i]); 

) 

f close  (pFILE); 

) 

else 

{ 

printf  ("unable  to  write  to  .config.dbXn") ; 
return  (NO) ; 

) 

/*  change  BE_path  and  CNTRL_path  to  include  new  version  name  */ 
set_paths  (current_version,  BE__path,  CNTRL__path) ; 

/*  edit  master . run . be  file  and  put  in  sample  run.be  file  in  pwd 
to  be  copied  to  each  back  end  */ 

printf  ("\nHave  Back  Ends  been  set  up  for  this  version?  (y/n)  ") ; 
scanf  ("%s",  pch) ; 
printf  ("\n"); 

.if  (*pch  —  'y'  I  I  *pch  —  'Y' ) 

{ 

return  (YES) ; 

} 

sprintf  (pch,  "sed  -e  \"s/VERSION/%s/g\"  master.run.be  >  run.be", 
current_version) ; 
printf  ("EXECOTING:  %s\n",  pch) ; 
system  (pch) ; 

/*  -  make  run.be  executable  after  copying  - 

system  ("chmod  ug+x  run.be");  */ 

sprintf  (pch,  "chmod  ug+x  run.be"); 
printf  ("EXECUTING:  %s\n",  pch) ; 
system  (pch) ; 
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for  (i  -  0;  i  <  MAX_BACK  ENDS;  i++) 

{ 

/*  -  create  new  version  directory  on  each  BackEnd  -  */ 

/*  -  could  test  before  execution  - 

rsh  %s  mkdir  current_version  */ 

sprintf (pch, "rsh  %s  -n  mkdir  be.%s",  be_names(i],  current  version) 
printf  ("EXECOTING:  %s\n'',  pch)  ; 
system  (pch) ; 

/*  -  copy  run.be  to  new  directory  on  each  BackEnd  -  */ 

/*  rep  run.be  %s/u/mdbs/be. %s  */ 

sprintf  (pch,  "rep  run.be  %s:%s  ",  be_names[i],  BE_path) ; 

printf  ("EXECOTING:  %s\n",  pch) ; 
system  (pch) ; 

/*  -  cbpy  6  processes  to  new  directory  on  each  BackEnd  -  */ 

/*  rep  ../BE/*.exe  %s:/u/mdbs/be. %s  */ 
sprintf  (pch,  "rep  /u/mdbs/%s/BE/*.exe  %s:%s  ", 

current_version,  be_names[i],  BE_path) ; 

printf  ("EXECOTING:  %s\n",  pch) ; 
system  (pch) ; 

) 

) 

else 

/*  user  did  not  wish  to  change  current  version  */ 
return  (NO) ; 

return  (YES)  ; 

}  /*  end  change_version  ()  */ 


/* - 

*  RECONFIGOBE  allows  user  to  change  configuration . 

* 

* 

* 

*  parameters : 

* 

* 

*  returns : 

* 

*/ 

int  reconfigure  (pN_BackEnds,  current_BackEnds) 
int  •pN_BackEnds; 
char  *current_BackEnds [ ] ; 

{ 


char  pch (MAX_STR1NG1 ;  /*  temporary  pointer  to  string  */ 

FILE  *pFILE;  /*  pointer  to  .config.db  file  */ 

int  reconfiguring  -  TROE;  /*  boolean,  if  BE  is  a  valid  one  */ 

int  i;  /*  array  index  */ 

/*  NOTE:  if  we  want  to  re  arrange  data  on  back-ends,  we  must  save 
the  old  current_BackEnds[l. .N_BackEnds]  data  to  a  tape 
then  reload  (via  tape  or  file — mass  load)  so  mdbs  will 
automatically  redistribute.  NOT  IMPLEMENTED  ♦/ 


no  illegal  configurations  are  accepted, 
assumes  get_configuration  con^leted  successfully 

pN_BackEnds  (in)  the  number  in  current  Configuration 

current_BackEnds []  strings  of  each  back  end  name 

YES  if  new  version 

NO  if  no  change  to  version 
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/*  -  get  new  configuration  */ 

while  (reconfiguring) 

{ 

printf  {"  Indicate  which  Back  Ends  you  want  in  the  configuration  \n") ; 
printf  ("  by  responding  y/n  as  each  Back  End  is  listed: \n"); 

for  (*pN  BackEnds  -  0,  i=0;  i<  MAX_BACK  ENDS;  i++) 

( 

printf  ("\n\t\t%s  ?  ",  be_names [ i ) ) ; 

scanf  {"%s",  pch) ; 

if  (*pch  —  'y'  I  1  *pch  »=  'Y' ) 

{ 

strcpy  {current_BackEnds [ *pN_BackEnds ) ,  be_names I i ] ) ; 

( *pN_BackEnds ) ++; 

) 

) 

show_conf iguration  (*pN_BackEnds,  current_BackEnds,  current_version) ; 

/* - opportunity  to  confirm  changes - */ 

if  (*pN_BackEnds  ~  0) 

{ 

printf  ("WARNING:  "); 

printf  ("If  you  do  not  specify  any  back  ends,  the  configuration \n") 
printf  ("\t  will  not  be  updated\n\n'') ; 

} 

printf  ("Is  this  configuration  satisfactory?  (y/n)  ") ; 
scanf  ("%s",  pch) ; 
printf  ("\n"); 

if  (*pch  “  'y'  I  I  *pch  =*  'Y' ) 

{ 

reconfiguring  -  FALSE; 

) 

)  /*  end  while  (reconfiguring)  */ 

/* -  write  new  .config.db  only  if  valid - */ 

if  (*pN_BackEnds  —  0) 

./*  reload  old  configuration  since  new  one  is  invalid  */ 
get_configuration  (pN_BackEnds,  current_BackEnds ,  current_version) ; 

else 

if  (pFILE  -  fopen  (".config.db",  "w")) 

( 

fprintf  (pFILE,  "%s\n",  current_version) ; 


) 


for  (i-0;  i  <  *pN_BackEnds;  i++) 

{ 

fprintf  (pFILE,  "%s\n",  current_BackEnds  [i]); 

) 

f close  (pFILE) ; 

) 

else 


( 

printf  ("Error:  unable  to  write  to  .config.dbVn") ; 
printf'  ("\t. config.db  not  updated  with  change  (s) \n\n") ; 
printf  ("NtYou  must  do  one  of  the  following: \n") ; 
printf  ("\t\tchmod  u+w  .config.db  \n\t\trm  . config.dbXn") ; 
printf  ("\tthen  run  this  program  again. \n"); 
return  (ERROR) ; 

) 


/*  -  display  configuration  */ 

/* -  */ 


8how_conf iguration  (*pN_BackEnds,  current_BackEnds ,  current_version) ; 


zero_backend8  ( *pN_BackEnds ,  current_BackEnds ) ; 
return  (NO_ERROR) ; 

/*  end  reconfigure  ()  */ 
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/* - 

*  MAIN  parameters:  command  line  arguments  (count  and  strings) 

* 

*  see  comments  at  beginning  of  file,  and  in  code 

*/ 

main  (argc,  argv,  env) 
int  argc; 
char  *argv[]; 
char  *env [ ] ; 

{ 

int  n_BackEnds  -  0;  /*  number  of  Back  Ends  being  used  */ 

char  pch[MAX_STRING] ;  /*  temporary  pointer  to  string  */ 

char  pwd[MAX_STRlNG] ;  /*  current  directory,  pointer  to  string  */ 

int  i;  /*  array  index  */ 

int  new_ver  =  FALSE;  /*  if  -v  option  used  to  call  this  program  */ 

int  new_config  =  FALSE;  /*  if  -c  option  used  to  call  this  program  */ 

int  dist_code  =  FALSE;  /*  if  -d  option  used  to  call  this  program  */ 

char  current_version  [MAX_STRING] ;  /*  read  in  from  .config.db  file  */ 

char  BE_path[MAX_STRING]  ;  /*  see  set_path  ()  ;  "/u/mdbs/be.VERSION/";  */ 

char  CNTRL_path [MAX_STRING] ;  /*  see  set_path  ();  "/u/mdbs/VERSION/CNTRL/"; */ 

printf  ("\n\nWelcome  to  MDBS,  today  is  "); 
f flush  (stdout) ; 
system  ("date") ; 

printf  ("\n\n"); 


/* -  check  for  command  line  parameters  */ 

if  (check_for_flags  (argc,  argv,  4new_ver,  Snew_config,  4dist_code  ) 
return  (ERROR) ; 

/* -  read  .config.db  */ 


if  (get  configuration  (4n  BackEnds,  current  BackEnds,  current  version)) 

{ 

printf  ("\nError  in  get_configuration.  Exiting  configure  program.  \n") ; 
return  (ERROR) ; 

)  . 

getwd  (pwd)  ; 

if  (not_in_pwd  (pwd,  current_version) ) 

{ 

printf  ("\nWARNING:  version  name  \"%s\"  from  .config.db",  current_version) 
printf  ("\n\t  not  in  current  path:  %s\n",  pwd) ; 

printf  ("\t  MDBS  will  not  run  unless  version  name  is  in  pwd\n"); 

) 

set_paths  (current_version,  BE_path,  CNTRLjpath) ; 

/*  -  display  executable  listings  of  BE  and  CNTRL  */ 

printf  ("\nCheck  the  time  each  file  was  last  compiled: \n\n") ; 

sprintf  (pch,  "Is  -1  . . /BE/* .exe") ; 
system  (pch) ; 

printf  ("\n") ; 

sprintf  (pch,  "Is  -1  .. /CNTRL/* .exe") ; 
system  (pch) ; 

printf  ("\nThere  should  he  12  files  listed,  if  not  you  need  to  recompile. \n"); 
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/★ -  ask  to  compile  if  -d  and  -v  not  used  to  call  main  */ 

/* -  */ 

if  (dist_code  "  FALSE  6i  new_ver  ■>—  FALSE)  { 


printf  (“\nDo  you  need  to  recompile  any  executable  and/or  copy  the  6  ") ; 

printf  (''\n  executable  files  to  each  Back  End  "); 

printf  ("Snlbget,  bput,  cc,  dio,  dirman,  recp.exe)?  (y/n)  "); 

scanf  ("%s'',  pch)  ; 
printf  («\n"); 

) 

if  (*pch  ==■  'y'  I  I  *pch  «•  'Y'  |  |  di3t_code  ) 

printf  ("\nDo  you  need  to  recompile  before  copying  to  the  back  ends?  (y/n)  ") 

scanf  ("%s",  pch) ; 
printf  ("Xn"); 


if  (*pch  --  'y'  I  I  *pch  -=  'Y' ) 

{ 

/* - recompile  executable  code  */ 

/* - */ 

sprintf  (pch,  /bin /make  all"); 

printf  (*EXECOTING;  %s\n",  pch) ; 
system  (pch) ; 


) 

/* - copy  recompiled  code  to  each  back  end  */ 

/*  -  */ 


sprintf  (pch,  "  ") ; 

for  (i  -  0;  i  <  MAX_BACK_EKDS;  i++) 

{ 

/*  for  each  back  end  being  used,  stop  6  processes  on  it 

which  may  still  be  running  from  last  time  */ 

sprintf  (pch,  “rep  ../BE/*.exe  %s:%s",  be  names [i],  BE_path) ; 
printf  (“EXECUTING:  %s\n",  pch) ; 

system  (pch) ; 

} 

}  /*  end  if  recompile  or  recopy  BE  processes  */ 


/* 

/* 

if 


) 


to  change  the  current  version  -v  option  must  be  used 


(new_ver) 

change_version  (Sn_BackEnds,  current_BackEnds, 

cur rent_ver Sion,  BE_path,  CNTRL_path,  pwd) ; 


*/ 

*/ 


show_configuration  (n_BackEnds,  current_BackEnds,  current_version) ; 


/* -  ask  to  reconfigure  if  -c  not  used  */ 

/* - ■ - */ 


if  (new_config  FALSE)  { 

printf  ("\n  WARNING:  All  data  will  be  lost  if  you  reconf igureXn") ; 
printf  ("\n  Do  you  wish  to  reconfigure  the  Back  Ends?  (y/n)  “) ; 
scanf  (“%s",  pch) ; 
printf  (“\n"); 

) 

if  {*pch  "  'y'  I  1  *pch  «-  'Y'  I  1  new_config) 

{ 

if  (reconfigure  (6n_BackEnds,  current_BackEnds)  «=  ERROR) 
return  (ERROR) ; 

} 
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else  /*  - do  not  want  to  reconfigure  back  ends  — 

{ 

/* -  ask  to  use  present  database - 

/* - 

/*  -  display  currently  loaded  database  (pry) 

/* - 


printf  ("\n  Do  you  wish  to  use  current  database?  (y/n)  ") ; 
scanf  ("%s",  pch) ; 
printf  ("\n"); 


*/ 

*/ 

*/ 

*/ 

*/ 


if  (*pch  “  'y'  II  *pch  —  'Y') 

{ 

/*  assumes  that  the  user  knows  what  database  is  loaded  */ 

/*  -  */ 

/*  could  check  for  existing  data  on  disks  here 

printf  ("Checking  what  database  is  on  Back  Ends  is  ") ; 
printf  ("NOT  IMPLEMENTEDXnYou  have  to  remember  or  use  pry\n");*/ 
$ 

) 

else 

{ 

2ero_backends  (n_BackEnds,  current_BackEnds) ; 

) 


}  /*  end  else  of  if  (reconfigure?) 


*/ 


printf  ("\n  Do  you  wish  to  run  the  Multi  Modal,  Multi  Lingual, \n") ; 
printf  ("\t\tMulti  Backended  Database  System?  (y/n)  "); 
scanf  ("%s",  pch); 
printf  ("\n"); 


if  (opch  —  'n'  II  *pch  —  'N') 

( 

return  (NO_ERROR) ; 

) 


/•  -  verify  current_version  in  pwd  - 

/* -  explain  what  to  do  and  ABORT  if  it  is  not 


if  (not_in_pwd  (pwd,  current_version) ) 

{ 

printf  ("\nERROR:  version  name  \"%s\"  from  .config.db", 
printf  ("\n\tis  not  in  current  path:  %s\n\n'',  pwd)  ; 


*/ 

*/ 


current  version) 


printf 

printf 

printf 

printf 

printf 

printf 

printf 


("\tlf  version  name  is  not  correct,  restart  with  -v  option \n") ; 
("\tto  correct  version  narae.\n\n"); 

("\tlf  version  name  is  correct,  then  you  are  running  mainXn"); 
("\tfrom  the  wrong  directory.  You  must  eitherXn") ; 

("\t\trun  main  from  /mdbs/%s/run  or\n",  current_version) ; 
("\t\tchange  begin  /  mdbs  /  start  alias  to  cd  to  /mdbs/"); 
("%s/run\n\n'',  current  version); 


) 


printf  ("\tWhen  changing  versions,  remember  to  verify  default_version") ; 

printf  ("  is  \n\tcorrect  in  configure. h.  If  a  change  is  made,\n"); 

printf  ("\t re compile  main.c  then  copy  main  to  /mdbs/"); 

printf  ("%s/run/main\n\n",  current_version) ; 

printf  ^'NtExiting  configure  program. \n") ; 

return  (ERROR) ; 
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/*  -  verify  current_version  matches  default_version  -  */ 

/* -  explain  what  to  do  and  ABORT  if  no  match  -  */ 

if  (strncmp  (default_version,  current_version,  strlen  (current_version) ) ) 

print  f  ( " \nERROR : " ) ; 

printf  {''\tdefault_version  \"%s\"  does  not  match  \n",  default_version) ; 
printf  ("\tcurrent_version  \"%s\''\n",  current_version)  ; 
printf  r'XtIf  current_version  is  correct, \n") ; 

printf  ("\t\tGo  to  /mdbs/%s/bin  directorySn",  current_version) ; 

printf  ("\t\tln  configure. h  correct  default_version  \n"); 

printf  ("XtXtcompile  main.c  by  typing  'make  main'Xn"); 

printf  {"\t\tmakefile  will  automatically  copy  main  to  run/main\n\n") ; 

printf  ("\tlf  current_version  is  not  correct,  restart  with  -v  optionXn") ; 

printf  r'Xtto  correct  version  name.\n\n"); 

printf  ("XtExiting  configure  program. \n") ; 

return  (ERROR) ; 

) 

/* - kill  all  lingering  processes - 

-  */ 


for  (i  ••  0;  i  <  n_BackEnds;  i++) 

{ 

/*  for  each  back  end  being  used,  stop  6  processes  on  it 

which  may  be  still  running  from  last  time  */ 

sprintf  (pch,  ”stop.%s  »$!  trace",  current_BackEnds [ i ] ) ; 
system  (pch) ; 

} 

/*  ensure  the  6  processes  are  stopped  on  the  controller  */ 

sprintf  (pch,  "stop.%s  »$!  trace",  controller); 
system  (pch) ; 


/*  -  start  5  controller  processes 


printf  ("EXECUTING:  start. cntrlXn",  pch) ; 
system  ("start . cntrl") ; 


*/ 


/*  NOT  IMPLEMENTED:  check  all  5  processes  running  */ 


/* -  start  all  back  end  processes  on  each  back  end  */ 

/*  execute:  "rsh  %s  -n  /u/mdbs /be. greg/run. be"  */ 

/* -  */ 


for  (i  -  0;  i  <  n_BackEnds;  i++) 

( 

sprintf  (pch,  "rsh  %s  -n  %srun.be  S",  current_BackEnds [i] ,  BE_path) ; 
printf  ("EXECUTING:  %s\n",  pch) ; 
system  (pch) ; 

} 

/*  NOT  IMPLEMENTED:  check  all  processes  running  on  each  back  end  */ 


/*  -  start  user  interface  on  (this  machine,  the)  controller  */ 

/*  execute:  ti.exe  'n  BackEnds'  */ 

/* - r -  */ 


sprintf  (pch,  "%s%s  %d",  CNTRL_path,  CNTRL_process  [0],  n_BackEnds) ; 

printf  ("EXECUTING:  %s\n",  pch) ; 
system  (pch) ; 

return  (N0_ERR0R) ; 

)  /*  end  main ()  * / 

/*  eof:  main.c  */ 
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APPENDIX  F.  MISCELLANEOUS  FACTS  ABOUT  USING  MDBS 


A  substantial  portion  of  this  appendix  was  based  on  the  README  file  written  by  John 
Locke,  systems  programmer  at  the  Naval  Postgraduate  School. 

A.  HOW  TO  START-UP  THE  SYSTEM  AFTER  THE  POWER  GOES  OFF. 

1.  For  dbl  -  db9 


Turn  on  monitors  (for  ail  systems) 

Turn  on  dbS  (use  switch  above  tape  drive) 


wait  about  10  seconds 

push  left  of  4  red  horizontal  buttons  (1’  from  floor) 


At  monitor  type 
@  <CR> 


(<CR>  means  carriage  return) 


you  should  not  get  prompted  for  the  date  since  an  internal  battery  exists 


For  each  Back  end  Unit: 

Turn  on  power  switch  centered  on  the  front  top  of  the  machine) 

wait 

Push  START  button  in  vertical  column  of  brown  buttons) 
wait  for  green  light 

Push  left  of  4  red  horizontal  buttons  (T  from  floor) 

At  monitor  type 
@  <CR> 

when  prompted  for  date  (you  only  get  30  seconds,  or  the  shut-down  time  becomes  the 
current  time) 

enter  date  in  format:  YYMMDDHHMM 

These  instructions  are  short  (and  cryptic)  since  most  people  do  not  use  them,  and  the 
typical  person  does  not  want  to  read  long  instructions. 

2.  For  Sun4  Workstations 

Turn  on  power.  If  automatic  reboot  does  not  start,  hold  down  the  STOP  (or 
FI)  button  then  push  the  A  button.  At  the  ‘>’  prompt,  enter  ‘b’  for  boot.  If  the  system  does 
not  reboot,  then  seek  help  from  a  systems  programmer. 
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B.  SOURCE  DIRECTORY  LOCATION 


This  source  distribution  is  in  directory  /u/mdbs/greg  on  host  db8.  The  greg  version  is 
ari  improvement  on  source  from  directory  /u/mdbs/7  on  host  db4.  Version  7  is  protected  by 
having  root  as  the  owner.  To  work  on  the  system  it  is  STRONGLY  recommended  to  make 
a  copy  of  the  whole  distribution  or  at  least  the  file  or  fles  which  you  modify  BEFORE 
making  any  changes.  For  example,  to  copy  the  source  into  a  directory  named  “new”  execute 
this  command  (which  will  work  across  the  network): 

>  rep  -r  c3b8  : /u/mdbs/greg  new 

The  rest  of  these  instructions  will  assume  the  “new”  is  the  new  source.  You  will  want 
to  use  a  more  distinctive  name,  like  your  own.  Expect  the  copying  to  take  about  3  minutes. 

C.  COMPILING 

To  compile  or  recomp"  '  all  or  part  of  MDBS,  change  directory  to  the  bin  directory  of 
the  version  you  need  compiled.  Then  type  “make  all”  to  compile  any  code  which  might 

have  been  modified  ^  This  includes  any  utility  programs  stored  in  any  subdirectory  which 
has  corresponding  lines  in  a  makefile  to  compile  them.  Other  options  for  the  main  makefile 
include  auto,'  clean,  and  dist. 

make  auto  will  compile  only  the  system-generation  program  (aka  configure.h  and 
main.c). 

make  clean  will  remove  all  .o  files 

make  dist  will  compress  all  files  into  /u/mdbs/dist.tar  and  store  that  file  to  tape 

'^^en  you  type  make  all,  a  file  with  shell  commands,  makeall,  is  executed.  This  file 
calls  the  mk  file  in  directories  corresponding  to  each  of  the  12  processes.  Ensure  all 


1.  It  is  possible  that  a  modified  file  will  not  cause  the  makefile  to  recompile  the  associated  process. 
This  may  be  caused  by  a  missing  or  incorrect  dependency  in  the  corresponding  makefile.  You 
should  check  the  make  file  and  correct  the  dependency  (or  force  recompililatior.  by  deleting  the 
object  file  for  the  affected  file  and  some  or  all  of  the  object  and  executable  files  compiled  from  that 
file).  Then  make  the  affected  file. 
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controller  processes  are  not  running  by  using  the  automatic  system  start  up  program  or  by 
using  the  stop.dbS  file  to  kill  them. 

A  successful  compilation  should  look  something  like: 

>  cd  bin 

>  make  all 

Making  the  Configure  Program 
Making  IIG. . . 

Making  CPCL. . . 

Making  PP . . . 

Making  REQP . . . 

Making  TI . . . 

Making  CC . . . 

Making  DIO. . . 

Making  DM. . . 

Making  BPCL. . . 

Making  RECP. . . 

Controller  Processes... 


-irwxrwxr-x 

1 

mdbs 

75398 

May 

21 

15:18 

.  . /CNTRL/cget 

-rwxrwxr-x 

1 

mdbs 

75398 

May 

21 

15:18 

. ./CNTRL/cput 

-rwxrwxr-x 

1 

mdbs 

56657 

May 

21 

15:17 

. ./CNTRL/iig 

-rv7xrwxr-x 

1 

mdbs 

56947 

May 

21 

15:20 

. . /CNTRL/pp 

-rwxrwxr-x 

1 

mdbs 

92551 

May 

21 

15:23 

. . /CNTRL/reqp 

-rwxrwxr-x 

1 

mdbs 

116578 

May 

21 

15:26 

. ./CNTRL/ti 

Backend  Processes... 

-rwxrwxr-x 

1 

mdbs 

75414 

May 

21 

15:36 

. ./BE/bget 

-rwxrwxr-x 

1 

mdbs 

75414 

May 

21 

15:36 

. . /BE/bput 

-rwxrwxr-x 

1 

mdbs 

67607 

May 

21 

15:29 

. . /BE/cc 

-rwxrwxr-x 

1 

mdbs 

46655 

May 

21 

15:30 

. ./BE/dio 

-rwxrwxr-x 

1 

mdbs 

93213 

May 

21 

15:34 

. . /BE/dirman 

-rwxrwxr-x 

1 

mdbs 

108419 

May 

21 

15:39 

. . /BE/recp 

The  output  from  the  several  make  files  are  put  in  either  err  (on  cs4322  VerE.6)  or 
make_result  (on  mdbs  version  7  and  later)  files  corresponding  to  the  mk  file.  The 
make_result  files  can  be  viewed  by  typing: 

>  cd  /u/mdbs/new 

>  more  */*/make_result 

REMEMBER  to  copy  the  back  end  processes  to  the  back  ends  each  time  you 
recompile  since  the  back  end  executables  are  stored  on  the  controller  and  are  not  copied  to 
each  back  end  automatically.  The  automatic  system  startup  program  will  do  the  copying  for 
you  (see  Appendix  B,  paragraph  C).  This  copying  should  be  done  each  recompilation  since 
any  modification  to  an  include  file  could  affect  inter-process  communications.  A  successful 
make  will  change  the  date  and  time  of  the  processes  whose  code  was  modified. 
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D.  COMPILING  AN  INDIVIDUAL  PROCESS 

To  compile  an  individual  process,  change  directory  to  the  source  for  that  process  and 
execute  the  script  “mk”.  “Mk”  will  change  into  the  “Object”  directory,  run  the  makefile 
and,  if  successful,  will  compile  the  updated  process  and  place  it  in  the  CNTRL  or  BE 
directory  as  appropriate.  The  results  of  the  make  will  be  written  to  the  file  “make_result” 
in  the  process  source  directory.  It  should  always  be  read  after  compiling  to  see  what 
actually  happened.  In  the  event  of  an  unsuccessful  compile,  it  will  contain  the  error 
message. 

E.  DEBUGGING  AND  FLAGS 

Most  source  files  have  a  line  near  the  top  to  include  the  file“flags.def 
#include  “flags.def  ’ 

This  causes  the  preprocessor  to  replace  the  #include  line  with  the  contents  of 
“flags.def ’.  An  abbreviated  version  of  “flags.def’; 

/* 

#define  EnExFlag 
#define  msg_pr_flag 
*/ 

Normally,  the  #define  lines  in  “flags.def’  will  be  bracketed  with  the  C  comment 
delimiters.  There  is  one  important  exception,  #define  LangIF_Flag,  which  must  not  be 
commented  out.  If  LangIF_Flag  is  not  defined,  then  you  will  be  unable  to  use  any  language 
other  than  the  attribute  based  data  language.  A  conunent  has  the  effect  of  turning  off  the 
flag(s).  To  turn  one  on,  it  is  necessary  to  edit  “flags.def’  and  move  the  selected  flags  outside 
of  the  comment  delimiters.  This  will  not  force  a  recompile  of  the  entire  process  because 
“flags.def’  is  not  listed  as  a  dependency  for  the  source  files  in  the  makefile.  We  don’t 
always  want  every  process  compiled  with  flags  turned  on.  It  can  create  a  great  deal  of 
debugging  output,  and  we  may  have  isolated  the  problem  area  of  the  code  by  then.  So  after 
editing  “flags.def’,  the  next  step  is  to  remove  the  object  files  for  the  modules  we  want  flags 
turned  on  for.  These  files,  ending  with  .o,  match  the  names  of  the  .c  files  and  are  to  be  found 
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in  the  “Object”  directory  below  the  process  source.  After  removing  the  relevant  .o  files, 
build  the  process  with  “mk”. 

So  what  does  turning  on  a  flag  do?  Say  I  edited  “flags.def  ’  to  be: 

#define  EnExFlag 
/* 

#define  msg_pr_flag 
*! 

The  preprocessor  variable  EnExFlag  is  now  defined  to  a  logical  true.  The  preprocessor 
will  act  conditionally  upon  encountering  structures  like  this: 

#ifdef  EnExFlag 

printf ( 'Enter  func_x\n' ) ; 
f flush (stdout) ; 

#endif 

If  EnExFlag  is  true,  the  code  between  the  #ifdef  and  #endif  lines  will  be  included  in 
the  compiled  object  llie  EnExFlag,  it  so  happens,  is  used  to  show  when  the  entrance  and 
exit  to  functions  occurs  in  the  flow  of  execution.  Other  flags  have  different  purposes-these 
are  documented  in  “flags.def ’. 

To  turn  off  flags  requires  a  simple  reversal  of  the  procedure.  Place  the  #define  lines 
back  between  the  comment  delimiters  in  “flags.def ’,  remove  the  object  files  that  have  flags 
turned  on,  and  recompile  the  process. 

REMEMBER  that  there  are  multiple  copies  of  some  include  files,  particularly 
flags.def.  If  you  make  a  change  to  one,  you  may  or  may  not  want  to  change  the  other  copies 
of  the  files  depending  on  what  stage  of  programming  or  debugging  you  are  in. 

F.  THE  HIGHER  LEVEL  LANGUAGE  INTERFACES 

In  version  7,  the  make  files  are  set  up  to  compile  the  test  interface  (TI)  without  the  four 
higher  level  language  interfaces,  since  not  everyone  works  with  them  and  to  debug  the 
controller-back  end  communications.  To  include  this  code  these  two  steps  must  be 
followed: 

1.  Edit  flags.def  in  TI  to  define  the  LangIF_Flag.  See  the  above  section  on 
debugging  flags. 
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2.  Edit  Tl/makefUe  and  uncomment  the  LIBS  definition  and  the  line  with 
“cd  ../LangIF’.  Both  are  marked  with  comments. 

3.  “Make  all”  in  Version/bin  as  described  above. 

To  turn  off  the  higher  level  language  interfaces,  reverse  the  above  steps  in  the  same 
order  listed. 

G.  CHECKING  THE  HARDWARE  CONFIGURATION 

The  system  is  typically  run  on  a  controller  (usually  dbS)  and  at  least  one  back  end.  The 
controller  requires  no  special  hardware,  but  is  named  in  the  file  COMMON/pcl.def  Change 
this  file  to  use  a  controller  other  than  dbS.  Unlike  dbS,  each  back  end  must  have  two  disk 
drives  to  be  used  as  the  meta  and  data  disks,  /dev/sdlc  and  /dev/smOc,  respectively.  Not  all 
the  ISI’s  have  both  disks  in  working  order.  To  determine  if  the  disks  are  available,  perform 
a  simple  read  operation  on  each,  for  example: 

>  head  /dev/sdlc 

The  prompt  should  return  quickly.  If  the  following  message  prints,  the  disk  is  not 
available  and  the  machine  cannot  be  used  as  a  back  end: 

>  head  /dev/smOc 

/dev/smOc;  No  such  device  or  address 

If  this  message  appears,  the  ownership  of  the  disk  file  must  be  changed  by  a  staff 
member: 

>  head  /dev/sdlc 
/dev/sdlc:  Permission  denied 

at  the  time  of  this  writing,  the  status  of  equipment  is: 

dbl  Pt  Mugu 

db2  SUPSifiP  San  Francisco 

db3  back  end,  working 

db4  back  end,  working 

db5  no  meta  disk 
db6  no  data  disk 
db7  back  end,  working 

dbS  controller,  working 

db9  no  data  disk 


91 


H.  THE  DEVELOPMENT  CYCLE 


This  is  to  give  the  big  picture  of  working  on  the  MDBS  code.  There  are  a  few  steps  to 
the  development  cycle.  Apart  from  the  actual  coding,  all  are  easy  but  it  is  easy  to  forget  a 
step  and  then  be  confused  by  the  results  of  a  run.  The  following  will  provide  a  checklist  of 
steps  to  follow.  It  assumes  that  the  initial  set-up  has  already  been  done  and  the  system  has 
been  successfully  run. 

1.  Edit  code 

2.  Compile  process  with  individual  process“mk”  or  entire  system  “make  all” 

3.  Check  trace  files  for  debugging  output 

4.  If  any  include  file  or  back  end  code  was  modifed,  copy  the  new  back  end 
process(es)  to  the  directory  on  each  back  end  corresponding  to  the  version  name 
(e.g.  be.greg).  This  can  be  done  by  the  system-generation  software  (Appendix  B, 
section  C). 

5.  Run  the  system  (as  described  in  Appendix  B,  section  A). 

L  RUNNING  THE  SYSTEM 

To  start  the  system,  see  Appendix  B.  Once  the  user-interface  process  is  running,  the 
system  might  appear  to  be  up,  though  some  processes  have  died.  This  wiU  usually  be  due 
to  changes  you  have  made  to  the  code  not  working.  For  example,  to  check  the  process  table 
on  a  back  end: 

>  ps  ax  I  grep  *.exe" 

3956  ?  S  0:00  /u/mdbs/be .Version_name/bget . exe 

3957  ?  S  0:00  /u/mdbs/be .Version_naine/dinnan . exe  100 

3958  ?  S  0:00  /u/mdbs/be .Version_name/cc . exe 

3959  ?  S  0:00  /u/mdbs/be. Version_name/recp. exe 

3960  ?  I  0:00  /u/mdbs/be .Version_name/dio . exe 

3961  ?  S  0:00  /u/mdbs/be .Version_name/bput . exe 

The  output  shows  the  six  back  end  processes  running.  The  run  scripts  are  set  to 
channel  their  output,  if  any,  to  files  with  the  name  “proc_name.tr”,  where  proc_name  is  the 
name  of  the  process.  If  a  process  is  compiled  to  produce  debugging  data,  the  data  will  be 
output  to  that  file,  (ti.exe,  the  “test”  user  interface,  is  an  exception— it  outputs  to  the  screen. 
To  save  TI  output,  execute  “script”  before  starting  the  program.) 

Note  that  the  distribution  assumes  use  of  only  one  back  end.  The  number  of  back  ends 
is  given  to  ti  as  an  argument  when  called  by  the  system-generation  program. 
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J.  UTILITY  PROGRAMS 

Several  utility  programs  have  been  written  for  mdbs.  The  executable  file  for  each  is 
usually  kept  in  the  bin  directory  of  the  version.  Location  of  source  is  noted  for  binaries. 


constants  -  prints  out  meta  table  sizes  (source  in  DM) 

cpcount  -  copies  a  file  for  a  given  number  of  bytes;  used  to  copy  meta  and  data  disks 
to  other  back  ends  (source  in  DIO) 

cpydisks  -  shell  script  that  uses  cpcount;  WARNING:  Do  not  run  this  program  from 
the  controller 

disp  -  dynamic  meta  table  disp  (source  in  DM) 

gdb  -  generate  test  database  of  arbitrary  size  from  template  file  (source  in  CNTRL/ 
TV),  usage:  gdb  D.t  D[.r] 

makeaU  -  script  to  make  every  process  (called  by  bin/makefile) 

rectag  -  prints  out  formatted  data  from  data  disk  (source  in  DIO) 

zero  -  writes  all  O’s  to  a  file;  used  to  blank  meta  and  data  disks  (source  in  DIO) 
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APPENDIX  G.  SELECTED  MDBS  FILE  LISTINGS 


A.  CONTROLLER 

The  following  files  are  kept  in  the  ‘run’  directory  for  each  version. 


#  file:  run/master. run.be 

# ! /bin/csh 

echo  Running  backend  on  'hostname'... 

/u/mdbs/be . VERSION/bget . exe 

>&! 

/u/mdbs/be .VERSION/bget . tr  & 

/u/mdbs/be.VERSION/dirman.exe  100 

>&! 

/u/mdbs/be .VERSION/dirman. tr& 

/u/mdbs/be .VERSION/ cc . exe 

>&! 

/u/mdbs/be .VERSION/cc .tr  & 

/u/mdbs/be . VERSION/recp . exe 

>&! 

/u/mdbs/be. VERSION/recp. tr  & 

/u/mdbs/be . VERSION/dio . exe 

>&! 

/u/mdbs/be. VERSION/dio. tr  & 

(sleep  16;  /u/mdbs /be. VERSION/bput. exe 

>&J 

/u/mdbs/be .VERSION/bput . tr ) & 

master.riin.be 


#  file:  run/stop. db4 

#  echo  stopping  processes  on  isiv4 
rsh  db4  /u/mdbs /bin/ stop.cmd 


#  file:  run/stop. db8 


echo  stopping  processes  on  'hostname',  the  controller 

ps  -X  I  awk  -f  .exe.awk  >!  list2.stop 
#  file  .exe.awk  contains:  /exe/  {print  $1} 

set  noclob 

set  a='cat  list2.stop' 
if  ($#a)  then 

echo  killing  $a 
kill  $a 

else 

echo  no  processes  to  kill  on  'hostname' 
endif 

rm  list2,stop 

stop.dbS 


#  file:  run/zero. db8 

#  this  is  only  for  the  controller 
# ! /bin/csh 

echo  Removing  CINBT  and  IIG  AT  tables  on  controller,  'hostname'... 
rm  -f  /u/mdbs/UserFiles/* .cinbt 
rm  -f  /u/mdbs/UserFiles/*. iig.at 
rm  -f  /u/mdbs/UserFiles/ .R* 

rm  -f  /u/mdbs/ . * .pid 

rm  -f  /u/mdbs/greg/run/trace/* . tr 

#  removing  ?  from  Sockets:  copied  from  cs4322  acct 
rm  -f  /u/mdbs/Sockets/G_PCLB 

rm  -f  /u/mdbs/Sockets/G_PCLC 

#  there  is  no  backend  meta  disk  on  db8 

#  there  is  no  backend  disk  on  db8 

zero.db8 
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#  file:  /greg/run/start .cntrl .screen. trace 

#  THIS  FILE  WRITES  ALL  DEBUGGING  STATEMENTS  TO  THE  SCREEN 
# 1 /bin/csh 

echo  starting  5  of  6  controller  processes  on  'hostname'... 

.. /CNTRL /cget .exe  & 

. . /CNTRL/cput . exe  & 

.. /CNTRL /pp. exe  & 

. . /CNTRL/iig . exe  & 

. . /CNTRL/reqp.exe  & 

#  to  be  executed  when  called  by  program  which  called  this  shell  script 

#  output  goes  to  screen 

#  . . /CNTRL/ti .exe  n 


startcntrl 


#  file:  /greg/run/start .cntrl 

#  NORMAL  METHOD  is  to  put  debugging  comments  into  .tr  files 
# ! /bin/csh 

echo  starting  5  of  6  controller  processes  on  'hostname'... 

. ./CNTRL/ cget. exe  >&!  trace/ cget.tr  & 

.. /CNTRL/cput .exe  >&!  trace/cput .tr  & 

.. /CNTRL/ pp. exe  >&!  trace/pp.tr  & 

.. /CNTRL/iig. exe  >&!  trace/iig.tr  & 

../CNTRL/reqp.exe  >&!  trace/reqp.tr  & 

#  to  be  executed  when  called  by  program  which  called  this  shell  script 

#  output  goes  to  screen 

#  .. /CNTRL/ti .exe  n 


Startcntrl 


96 


Two  files  modified  for  Sun  csh 


#  file:  run/ stop. dbll 

#  modified  930427  for  compatability  with  Sun  system, 
echo  stopping  processes  on  'hostname',  the  controller 

#  !/bin/csh 

ps  -X  1  awk  -f  .exe.awk  >  list2.stop 

#  file  .exe.awk  contains:  /\.exe/  {print  $1) 

set  noclob 

foreach  a  ('cat  list2.stop') 
echo  killing  $a 
kill  $a 

end 

rm  list2.stop 


stop.dbll 


#  file:  /greg/run/start .cntrl 

#  modified  930512  for  compatability  with  Sun. 

# ! /bin/csh 
rm  trace/*. tr 

echo  starting  5  of  6  controller  processes  on  'hostname'... 

.. /CNTRL/ cget .exe  >  trace/cget .tr  & 

.. /CNTRL/ cput .exe  >  trace/cput .tr  & 

.. /CNTRL /pp. exe  >  trace/pp.tr  & 

.. /CNTRL/ iig. exe  >  trace/ iig.tr  & 

. . /CNTRL/reqp. exe  >  trace/reqp .tr  & 

#  to  be  executed  when  called  by  program  which  called  this  shell  script 

#  output  goes  to  screen 

#  . ./CNTRL/ti.exe  n 


startcntrl 
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B.  BACK  ENDS 


The  following  file  is  stored  on  each  back  end  in  the  ‘bin’  directory. 


#  file:  binstop.cmd 

e-ho  stopping  processes  on  back  end  'hostname' 

ps  -X  I  awk  -f  bin/exe.awk  >!  list2.stop 

#  file  exe.awk  contains:  /exe/  {print  $1) 

set  noclob 

set  a='cat  list2.stop' 
if  ($#a  !=  0)  then 
echo  killing  $a 
kill  $a 

#  for  some  reason,  this  won't  work  on  last  line  >$ !  stop. trace 

else 

echo  no  processes  to  kill  on  'hostname' 
end  if 

rm  list2.stop 


stop.cmd 

The  following  file  is  stored  on  each  back  end  in  the  ‘be. VERSION’  directory. 


# ! /bin/csh 

echo  Running  backend  on  'hostname'... 

/u/mdbs/be .greg/bget .exe 

>& 

/u/mdbs/be .greg/bget .tr  & 

/u/mdbs/be.greg/dirman.exe  100 

>& 

/u/mdbs/be . greg/dirman . tr& 

/u/mdbs/be . greg/cc . exe 

>& 

/u/mdbs/be .greg/cc . tr  & 

/u/mdbs/be .greg/recp . exe 

>& 

/u/mdbs /be. greg/recp. tr  & 

/u/mdbs /be . greg/dio . exe 

>& 

/u/mdbs /be. greg/dio. tr  & 

Asleep  16;  /u/mdbs /be. greg/bput. exe 

>& 

/u/mdbs/be .greg/bput . tr)  & 

run.be 
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